csarven closed this issue 9 years ago
Okay, do you mean having the download in doingbusiness.get.sh and the steps "convert to csv" and "remove html version" in doingbusiness.preprocessing.sh?
Or having the download in doingbusiness.get.sh, the step "convert to csv" in something like doingbusiness.convert.sh and the removal in something like doingbusiness.removeHTML.sh?
Either is okay. Perhaps doingbusiness.preprocessing.sh is sufficient as there aren't too many complicated steps.
The conversion step has been moved to the preprocessing script.
Decouple different processes in the same script. For example, https://github.com/csarven/doingbusiness-linked-data/commit/55812a66cca2537da253566820d9240b2959daf0 introduced two other steps: "convert to csv" and "remove html version". They should go into their own respective scripts dealing with preprocessing.
Leave the retrieved raw data alone and keep it in the system so that the workflow can be run without having to run the retrieval step again.
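For illustration, here is a rough sketch of how the split could look. It is not the code from the repository: the directory names, the function names, and the `sed` one-liner standing in for the real "convert to csv" step are all assumptions. The point is only that retrieval writes to a raw directory and skips files it already has, while preprocessing reads from that directory and writes its output elsewhere, so the raw data is never modified or removed.

```shell
#!/bin/sh
# Hypothetical layout; both directory names are assumptions.
RAW_DIR="${RAW_DIR:-data/raw}"
CSV_DIR="${CSV_DIR:-data/csv}"

# Retrieval step (what doingbusiness.get.sh might hold).
# Downloads a file only if it has not been retrieved before,
# so the rest of the workflow can be rerun without hitting the network.
get_raw() {
    url="$1"
    name="$2"
    mkdir -p "$RAW_DIR"
    if [ -s "$RAW_DIR/$name" ]; then
        echo "skip: $name already retrieved"
    else
        curl -fsSL -o "$RAW_DIR/$name" "$url"
    fi
}

# Stand-in for the real "convert to csv" logic, which is not shown
# in this thread. It only strips HTML tags, purely as a placeholder.
convert_to_csv() {
    sed -e 's/<[^>]*>//g' "$1"
}

# Preprocessing step (what doingbusiness.preprocessing.sh might hold).
# Reads from RAW_DIR and writes to CSV_DIR; raw files are left untouched.
preprocess() {
    mkdir -p "$CSV_DIR"
    for raw in "$RAW_DIR"/*.html; do
        [ -e "$raw" ] || continue   # no raw files yet
        out="$CSV_DIR/$(basename "$raw" .html).csv"
        convert_to_csv "$raw" > "$out"
    done
}
```

With this shape there is no separate "remove html version" step: the HTML stays in the raw directory, and only the derived CSV directory needs to be regenerated when the preprocessing changes.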