HazyResearch / bazaar


Keep local server register of started & completed segments #13

Closed ajratner closed 9 years ago

ajratner commented 9 years ago

Simplest / quickest way I can think of doing this: the parser writes a .reg file containing "0" while a segment is started and running, and "1" once it has completed. fab can then simply collect all the .reg files (super simple) and build a register of global status.
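A minimal sketch of the writer side of this idea, in Python for illustration only. The names `write_reg` and `process_segment`, the `parse_fn` callback, and the directory layout are hypothetical and are not taken from the bazaar code.

```python
import os


def write_reg(seg_dir, seg_id, status):
    """Write "0" (started/running) or "1" (completed) to <seg_id>.reg."""
    if status not in ("0", "1"):
        raise ValueError("status must be '0' or '1'")
    path = os.path.join(seg_dir, seg_id + ".reg")
    with open(path, "w") as f:
        f.write(status)
    return path


def process_segment(seg_dir, seg_id, parse_fn, lines):
    """Mark the segment as started, write parsed output, then mark it done.

    parse_fn is a hypothetical stand-in for the real parser: it maps one
    input line to one output line.
    """
    write_reg(seg_dir, seg_id, "0")
    out_path = os.path.join(seg_dir, seg_id + ".parsed")
    with open(out_path, "w") as out:
        for line in lines:
            out.write(parse_fn(line) + "\n")
    # Only reached if parsing finished without raising an exception.
    write_reg(seg_dir, seg_id, "1")
    return out_path
```

One consequence of this scheme: a segment whose worker crashes keeps a "0" in its .reg file, so "0" on its own cannot distinguish "still running" from "died".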

ajratner commented 9 years ago

@chrismre Have to run to dinner with Will (timeliness!) but quick rundown of process (bold is what I'm adding):

  1. Files split locally into segments
  2. Segments are distributed across servers / cores
  3. When a core picks up a segment, it writes "0" to seg_id.reg in addition to writing output to seg_id.parsed
  4. When a core completes a segment it writes "1" to seg_id.reg
  5. **Global status can be checked at any time via a simple 'collect' operation over `*.reg` in fab**
  6. At the end, the *.parsed files are collected and catted to get full output...
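Steps 5 and 6 could look roughly like the sketch below, assuming the .reg and .parsed files have already been fetched into one local directory (in the real setup fab would do that). The function names and the summary keys are hypothetical.

```python
import glob
import os


def collect_status(reg_dir):
    """Read every *.reg file in reg_dir into a {seg_id: "0" or "1"} dict."""
    status = {}
    for path in sorted(glob.glob(os.path.join(reg_dir, "*.reg"))):
        seg_id = os.path.splitext(os.path.basename(path))[0]
        with open(path) as f:
            status[seg_id] = f.read().strip()
    return status


def summarize(status):
    """Count completed ("1") and running ("0") segments."""
    completed = sum(1 for s in status.values() if s == "1")
    running = sum(1 for s in status.values() if s == "0")
    return {"completed": completed, "running": running, "started": len(status)}


def concat_parsed(parsed_dir, out_path):
    """Concatenate all *.parsed files (sorted by name) into out_path."""
    with open(out_path, "w") as out:
        for path in sorted(glob.glob(os.path.join(parsed_dir, "*.parsed"))):
            with open(path) as f:
                out.write(f.read())
    return out_path
```

Segments that no core has picked up yet have no .reg file, so they do not appear in this register; comparing against the full list of segment ids from step 1 would be needed to see what is still pending.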

Any thoughts? This just seemed quick and simple to implement given the current setup

@raphaelhoffmann any thoughts?

raphaelhoffmann commented 9 years ago

That is really cool! I like that you can do a global status check with a simple command. Thanks for building this.

On Thu, Aug 6, 2015 at 5:57 PM, Alex Ratner wrote the six-step rundown quoted above.

ajratner commented 9 years ago

@raphaelhoffmann Just need to merge my jsonreader-multi-key branch into master, or the relevant part of it, for this to work