Due to this bug https://github.com/akvo/akvo-flow-mobile/issues/614 some data.json files have an incorrect formId. For example: "formId":"Charity_Water_Combined_Community". In that case the data will never be imported correctly.
To fix that I can see 2 possible approaches:
1) Connect to Amazon S3 and iterate over all the folders inside the corresponding folder, unzip each file and check that the content is correct. If the formId is a string rather than a Long, we know we have to work it out from the questions and write it into the file.
Documentation from Amazon: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
Examples: http://stackoverflow.com/questions/3337912/quick-way-to-list-all-files-in-amazon-s3-bucket
2) Use ProcessorServlet. At that point we can detect that the formId is a string rather than a Long and start a separate task that parses the data and fills in the correct formId, reusing the same TaskServlet with an extra option to fix the formId.
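Both options depend on the same check: read data.json out of the zip and decide whether formId is a Long or a string. Below is a rough sketch of that check only, using nothing but the JDK. The class name `FormIdCheck` and its methods are hypothetical and not part of the existing code, and it assumes data.json sits inside the uploaded zip. The regex is a stand-in for illustration; real code should probably use whatever JSON parser the project already has.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the existing code base.
final class FormIdCheck {

    // Matches "formId":"some text" (group 1) or "formId":12345 (group 2).
    // Simplification: a regex instead of a real JSON parser.
    private static final Pattern FORM_ID = Pattern.compile(
            "\"formId\"\\s*:\\s*(?:\"([^\"]*)\"|([^\",}\\s]+))");

    // Returns the raw formId value found in the JSON text, or null if none is present.
    static String extractFormId(String json) {
        Matcher m = FORM_ID.matcher(json);
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    // True only when the value can be parsed as a Long, i.e. the file is not affected.
    static boolean isNumericFormId(String formId) {
        if (formId == null) {
            return false;
        }
        try {
            Long.parseLong(formId);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Reads the first entry whose name ends with data.json from a zipped stream.
    // Assumption: data.json is stored inside the uploaded zip.
    static String readDataJson(InputStream zipped) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().endsWith("data.json")) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[4096];
                    int n;
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null; // no data.json in this zip
    }
}
```

A file would need fixing when `isNumericFormId(extractFormId(json))` returns false. Looking up the right formId from the questions is not covered here.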
To me option 2 seems more achievable, but option 1 may be cleaner. The open question is how to make sure the task runs before the GAE code starts processing the files.