comses / miracle

Repeatable data analysis workflows for computational models

WIP: added analysis metadata extraction for .7z and .zip archives #11

Closed (cpritcha closed 9 years ago)

cpritcha commented 9 years ago

Refactored metadata. Added extractors.

landscape-bot commented 9 years ago

Code Health: Repository health increased by 0.34% when pulling 2507620 on cpritcha:master into 4d4607f on comses:master.

alee commented 9 years ago

Looks good overall! Some docs on the main collaborating classes and how they work together would be great, if you could add those in. When someone uploads a file to our server, is this a rough workflow for what we should do?

  1. Analyze the file to figure out what it is (could be a zipfile, or just a single dataset / script)
  2. Based on the file analysis, either use the Extractor to pull apart a zipfile and then Analyze all of its contents, or keep the single file's analysis as is
  3. Once every file has been Analyzed, use some logic based on each file's AnalysisMetadata to put the dataset / script in the right place for its parent Project, and store some of our extracted metadata in the RDBMS.
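
To make the question concrete, here is a minimal sketch of those three steps using only the Python standard library. It is not the code in this PR: `FileInfo`, `KINDS`, `analyze`, `process_upload`, and `group_by_kind` are hypothetical names, the extension-based classification is an assumption, and only `.zip` is handled because the standard library has no `.7z` reader. Storing metadata in the RDBMS is left out.

```python
import io
import zipfile
from dataclasses import dataclass
from pathlib import PurePosixPath


# Hypothetical stand-in for AnalysisMetadata; the real class may look different.
@dataclass
class FileInfo:
    name: str
    kind: str  # "archive", "dataset", "script", or "unknown"


# Assumed extension-to-kind mapping, for illustration only.
KINDS = {".zip": "archive", ".csv": "dataset", ".r": "script", ".py": "script"}


def analyze(name):
    """Step 1: classify a file by its extension."""
    suffix = PurePosixPath(name).suffix.lower()
    return FileInfo(name, KINDS.get(suffix, "unknown"))


def process_upload(name, payload):
    """Steps 1 and 2: analyze the upload, extracting it first if it is a zip."""
    info = analyze(name)
    if info.kind != "archive":
        return [info]
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        # Skip directory entries and analyze each member file.
        return [analyze(m.filename) for m in archive.infolist() if not m.is_dir()]


def group_by_kind(infos):
    """Step 3 (placement only): group analyzed files by kind."""
    grouped = {}
    for info in infos:
        grouped.setdefault(info.kind, []).append(info.name)
    return grouped
```

A caller would pass the uploaded file name and bytes to `process_upload`, then use the grouping to decide where each dataset or script goes under its Project.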

Note that we should probably introduce Celery and queueing and perform all of this asynchronously so the user isn't sitting and spinning on a webpage.
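
As a rough illustration of the asynchronous shape, here is a sketch that uses the standard library's `concurrent.futures` as a stand-in for a task queue. It is not Celery code; with Celery, the submit call would be replaced by dispatching a task to a worker. `analyze_upload` and `handle_upload` are hypothetical names, and the returned dictionary is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a task queue; a Celery worker pool would play this role.
executor = ThreadPoolExecutor(max_workers=2)


def analyze_upload(path):
    """Hypothetical placeholder for the slow extract-and-analyze work."""
    return {"path": path, "status": "analyzed"}


def handle_upload(path):
    """Called from the request handler: enqueue the work and return at once."""
    return executor.submit(analyze_upload, path)


if __name__ == "__main__":
    future = handle_upload("uploads/example.zip")
    # The web request can respond now; the result is collected later.
    print(future.result(timeout=5))
```

The point is only that the request handler returns a handle right away, so the page can poll for completion instead of blocking on the analysis.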