eecs485staff / madoop

A lightweight MapReduce framework for education
MIT License

Verbose output #36

Closed. awdeorio closed this 2 years ago

awdeorio commented 2 years ago

Add -v/--verbose flag.

Closes #15

To validate, run with and without the verbose flag. Please give me feedback on the output.

Wait to review until #33 is merged.
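
For readers who want to see how a flag like this is commonly wired up, here is a minimal sketch. It is not taken from this PR: the names `parse_args`, `level_for`, and `main`, the use of a repeatable count flag, and the mapping from flag count to log level are all assumptions made for illustration. The `LEVEL: message` format is chosen only to resemble the debug output quoted later in this thread.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse command line arguments (hypothetical CLI, not madoop's)."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "-v", "--verbose",
        action="count",   # assumption: the flag may be repeated, e.g. -vv
        default=0,
        help="increase output verbosity",
    )
    return parser.parse_args(argv)


def level_for(verbosity):
    """Map the number of -v flags to a logging level (assumed mapping)."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(
        format="%(levelname)s: %(message)s",
        level=level_for(args.verbose),
    )
    # Placeholder messages; a real program would log its own stages here.
    logging.info("placeholder info message")
    logging.debug("placeholder debug message")


if __name__ == "__main__":
    main()
```

Running this sketch with no flag, with `-v`, and with `-vv` shows progressively more output, which is one way to compare output with and without the verbose flag.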

codecov[bot] commented 2 years ago

Codecov Report

Merging #36 (ac22df3) into develop (cb1f7ea) will increase coverage by 1.89%. The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop      #36      +/-   ##
===========================================
+ Coverage    93.89%   95.78%   +1.89%     
===========================================
  Files            4        4              
  Lines          131      190      +59     
===========================================
+ Hits           123      182      +59     
  Misses           8        8              

Impacted Files         Coverage Δ
madoop/__main__.py     90.62% <100.00%> (+4.26%)
madoop/mapreduce.py    96.77% <100.00%> (+1.49%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update cb1f7ea...ac22df3.

awdeorio commented 2 years ago

This is now ready to go! @jaredzh

awdeorio commented 2 years ago

Great ideas @jaredzh! Take a look at the new version. In particular, the grouping stage debug output:

INFO: Starting group stage
DEBUG: mapper-output/part-00000 unique_keys=3
DEBUG: mapper-output/part-00001 unique_keys=3
DEBUG: mapper-output all_unique_keys=5
DEBUG: partition mapper-output/part-00000 >> reducer-input/{part-00000,part-00001,part-00002,part-00003}
DEBUG: partition mapper-output/part-00001 >> reducer-input/{part-00000,part-00001,part-00002,part-00003}
DEBUG: empty partition: rm reducer-input/part-00000
DEBUG: sort reducer-input/part-00001
DEBUG: sort reducer-input/part-00002
DEBUG: sort reducer-input/part-00003
DEBUG: reducer-input/part-00001 unique_keys=2
DEBUG: reducer-input/part-00002 unique_keys=1
DEBUG: reducer-input/part-00003 unique_keys=2
DEBUG: reducer-input all_unique_keys=5
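
The log above describes a group stage that partitions mapper output by key, drops empty partitions, sorts each remaining partition, and reports unique key counts. A minimal in-memory sketch of that general idea follows. It is not the code from this PR: the function names `keyhash` and `group_stage`, the MD5-based hash, the tab-separated `key\tvalue` line format, and the log message wording are all assumptions chosen for illustration.

```python
import hashlib
import logging

LOGGER = logging.getLogger("group_sketch")  # hypothetical logger name


def keyhash(key):
    """Return a deterministic integer hash of a key (assumed: MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def group_stage(mapper_outputs, num_partitions):
    """Partition, filter, and sort mapper output lines.

    mapper_outputs maps a name to a list of "key\\tvalue" lines.
    Returns a dict from partition index to its sorted lines, with
    empty partitions omitted.
    """
    partitions = [[] for _ in range(num_partitions)]
    all_keys = set()

    for name, lines in mapper_outputs.items():
        keys = {line.partition("\t")[0] for line in lines}
        LOGGER.debug("%s unique_keys=%d", name, len(keys))
        all_keys |= keys
        # Every line with the same key lands in the same partition.
        for line in lines:
            key = line.partition("\t")[0]
            partitions[keyhash(key) % num_partitions].append(line)

    LOGGER.debug("all_unique_keys=%d", len(all_keys))

    result = {}
    for index, lines in enumerate(partitions):
        if not lines:
            LOGGER.debug("empty partition %d skipped", index)
            continue
        result[index] = sorted(lines)
    return result
```

Because each key hashes to exactly one partition, the unique key counts of the non-empty partitions add up to the overall unique key count, which matches the pattern in the log above (2 + 1 + 2 = 5).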

jaredzh commented 2 years ago

LGTM. I made two changes: I fixed a bug where the reported number of map executions was off by 1, and I used >> in the partition output for the map stage so that it's consistent with the group stage. Go ahead and merge if it looks good. @awdeorio