matsim-eth / SCCER-Pipeline


Automate full pipeline #28

Open tduberne opened 5 years ago

tduberne commented 5 years ago

At least the trip segmentation part will have to be in Python, so we need a way to automate the succession of steps.

As a last resort, could be solved using a Makefile.

I am preparing a Luigi workflow somewhere else, and should know soon whether I like it or not. If so, I might use it here as well. My impression is that it could handle failures better (and is more powerful in terms of parameterizing the whole process, e.g. for testing).

joemolloy commented 5 years ago

I would suggest that we just use a bash or Python script to run the steps. Everyone knows how to use and read it, and there won't be many errors. If there are, we just record the person/trip ids that cause errors and investigate them manually.
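As a rough sketch of this approach: the runner below applies a list of steps to each trip id and records the ids that fail instead of aborting the whole run. The function name `run_steps`, the shape of the step functions, and the `failed_trips.csv` file name are all hypothetical, not taken from the repository.

```python
import csv


def run_steps(trip_ids, steps, error_log_path="failed_trips.csv"):
    """Run every step on every trip id; collect failures instead of stopping.

    `steps` is a list of callables (hypothetical pipeline stages). Each one
    receives the output of the previous step, starting from the trip id.
    """
    failures = []
    for trip_id in trip_ids:
        data = trip_id
        for step in steps:
            try:
                data = step(data)
            except Exception as exc:
                # Remember which id failed, at which step, and why,
                # then move on to the next trip.
                failures.append((trip_id, step.__name__, repr(exc)))
                break

    # Write the failing ids to a CSV file for manual investigation.
    with open(error_log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["trip_id", "step", "error"])
        writer.writerows(failures)
    return failures
```

The same loop would work with person ids; the point is only that one bad record ends up in a log rather than stopping the run.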

tduberne commented 5 years ago

What's the problem with a Makefile?

My main motivation in using other tools is parameterization. For instance, have one config for testing (where e-mails go to a test e-mail address or just to a log file) and one for "production" (where actual people actually get things in their mailboxes). Or make it possible to put the date in the name of the day's DB extract and run the full pipeline on it, with all intermediary files for that day properly named as well.
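To make the idea concrete, here is a minimal sketch of such a parameterization in plain Python, independent of whichever workflow tool ends up being used. Everything in it is an assumption for illustration: the `PipelineConfig` class, the `out/test` and `out/production` directories, the `test@example.org` address and the file naming scheme are not part of the existing code.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PipelineConfig:
    name: str
    work_dir: Path
    # If set, every e-mail is redirected to this address (test mode).
    email_override: Optional[str]
    run_date: date

    def dated_path(self, stem: str, suffix: str = ".csv") -> Path:
        """Name an intermediary file after the day being processed."""
        return self.work_dir / f"{self.run_date.isoformat()}_{stem}{suffix}"

    def recipient(self, real_address: str) -> str:
        """Return the address a message should actually be sent to."""
        return self.email_override or real_address


def make_config(mode: str, run_date: date) -> PipelineConfig:
    """Build the config for a hypothetical 'test' or 'production' mode."""
    if mode == "test":
        return PipelineConfig("test", Path("out/test"), "test@example.org", run_date)
    if mode == "production":
        return PipelineConfig("production", Path("out/production"), None, run_date)
    raise ValueError(f"unknown mode: {mode!r}")
```

With this, `make_config("test", date.today()).dated_path("db_extract")` gives the name of the day's extract, and each step can derive its own input and output names from the same config object.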

But I do not really care about the tool, and one can still change tools between pre-test and main study if it turns out to have problems. My main issue at the moment is how to integrate Alessio's code into the pipeline (it is not packaged in an easily re-usable way). Once I have this, stitching stuff together should be trivial.

joemolloy commented 5 years ago

OK, sure. I've worked with Makefiles before, so that should be fine.