akvo / akvo-flow

A data collection and monitoring tool that works anywhere.
http://akvo.org/products/akvoflow/
GNU Affero General Public License v3.0

Check the quality of counts management #741

Closed mtwestra closed 8 years ago

mtwestra commented 10 years ago

The system keeps running counts of answers. Because the data can change in several different ways, we need to stress test the counts along each of those pathways.

Test plan:

  1. Create a survey with three different option questions, at least one of which has an 'other' field, and at least one of which has 'multiple allowed'.
  2. Collect data (~10 responses) with the app while the phone is online, and check the counts (QuestionAnswerSummary objects).
  3. Collect data (~10 responses) with the app while the phone is offline, then go online, let the phone send the data, and check the counts.
  4. Modify data on the dashboard, and check the counts.
  5. Export the data to Excel, make changes to the data, and upload it again. Check the counts.
  6. Create a lot of extra data (~2000 rows) in Excel by adding extra rows, and upload. Check the counts.
  7. Collect data on the phone in offline mode, and use bulk upload to upload it. If you want, you can create extra fake data in the data.txt file to stress the system further. Check the counts.
  8. Delete data using the deleteSurveyResponses test harness. Check the counts.

If the counts are wrong after any step, you can recompute them first (using the test harness rebuildQuestionSummary method) and then continue with the next step, so we can identify which step causes the problem.
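For the "check counts" steps, it may help to compute the expected counts independently from the raw responses. The following is only a minimal sketch, not Flow code: `expected_counts` is a hypothetical helper, the `(question_id, answer_value)` pair shape is an assumption, and so is the `|` separator for 'multiple allowed' answers. An 'other' answer is simply tallied under its free text.

```python
from collections import Counter


def expected_counts(responses, separator="|"):
    """Tally option answers per question from raw responses.

    `responses` is an iterable of (question_id, answer_value) pairs.
    The pair shape and the "|" separator for multi-select answers are
    assumptions made for this sketch.
    """
    counts = Counter()
    for question_id, value in responses:
        # A 'multiple allowed' answer contributes one count per selected option.
        for option in value.split(separator):
            option = option.strip()
            if option:
                counts[(question_id, option)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    ("q1", "yes"),
    ("q1", "no"),
    ("q2", "red|blue"),  # multiple options selected
    ("q2", "blue"),
    ("q2", "purple"),    # free-text 'other' answer
]
tally = expected_counts(sample)
```

The resulting tally can then be compared by hand, or in a script, against the summary counts in the data store after each step.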

rumca commented 10 years ago

Update on counts testing.

Completed passing test cases:

Outstanding test cases (to be completed with 1.7.4 testing by 24/09/14):

Note: I am assuming that the cluster counts should be treated as a separate issue - I've regularly seen this form of counting fail to work correctly (see #644 for further details).

rumca commented 10 years ago

**TL;DR**

  1. **Batch transmissions of responses with multiple option questions occasionally don't update counts correctly - fixed by test harness call.**
  2. **Data cleaning updates do not update the counts correctly - not fixed by test harness call.**
  3. **Rogue data is sometimes left even after all responses are deleted - not fixed by test harness call.**

@muloem @mtwestra apologies in advance for the long comment, but having spent a lot of time testing this, here are my most recent findings.

From more exhaustive testing, it seems the only scenarios where counts (i.e. the count on the SurveyQuestionSummary kind in the data store) are always updated correctly are points 2, 4, and 7. These are:

All other scenarios in this test plan seem to either work most times (but not always), or else they consistently result in incorrect counts. I'm not sure why we didn't see the same problems when testing #651 - perhaps it has something to do with more than one option question being used this time.

When a user submits a batch of responses offline and these are then sent en masse to GAE once the phone is back online, the counts are updated correctly sometimes but not always. In the cases where the counts did not update as I expected, a call to the rebuildQuestionSummary method in the test harness was sufficient to bring the counts to the values I expected.

Exporting the data to Excel, making some changes to it, and then re-importing it never seems to result in the count incrementing as I would expect. For example, in the following instance (data store across the top, chart on the left, raw data report on the right):

[Image: testharness fail]

I made changes to the spreadsheet to increase the count of responses '1' from 13 to 14. These changes were uploaded successfully, and when I generate a raw data report I can see that there are now 14 responses with the option '1'. However, the count on the relevant SurveyQuestionSummary entity is never adjusted, even when the test harness call is used.
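A comparison like this could be scripted. Below is a minimal sketch, assuming the stored summary counts and the counts tallied from the raw data report are both available as plain dictionaries keyed by (question, option). `count_mismatches` is a hypothetical helper, and apart from the 13 and 14 reported above, the values are made up.

```python
def count_mismatches(stored, expected):
    """Return {key: (stored_count, expected_count)} wherever the two differ.

    Both arguments map (question_id, option) to a count; a missing key
    is treated as a count of zero.
    """
    keys = set(stored) | set(expected)
    return {
        key: (stored.get(key, 0), expected.get(key, 0))
        for key in keys
        if stored.get(key, 0) != expected.get(key, 0)
    }


# Stored summary still says 13, the raw data report shows 14.
# The second entry is made-up filler to show a matching count.
stored = {("q1", "1"): 13, ("q1", "2"): 7}
expected = {("q1", "1"): 14, ("q1", "2"): 7}
mismatches = count_mismatches(stored, expected)
```

Only the option whose stored count lags behind the raw data appears in the result, which makes it easy to see which step of the test plan introduced the drift.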

I made another observation during testing, which may or may not be related to the issues above. In some situations where I deleted all of the responses from the dashboard, the SurveyQuestionSummary kind counts were not updated correctly. For example, the chart below is generated with no actual data in the system:

[Image: a chart from nothing]

Calls to the test harness don't correct this; the only thing that fixes it is manually deleting it from the back end. Even after manually deleting the data (and flushing the cache), this rogue data seems to reappear from somewhere when you bulk upload. Just to be clear: in situations where all data can be deleted from the dashboard cleanly, with no extra counts left behind that have no data responsible for them, bulk upload is one of the data entry scenarios that works fine on the count front.
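Leftover counts of this kind could be flagged automatically. The sketch below assumes, as a simplification, that stored summary counts and counts tallied from the remaining responses are plain dictionaries keyed by (question, option); `orphaned_summaries` is a hypothetical helper and the sample values are made up.

```python
def orphaned_summaries(stored, expected):
    """Return stored summary counts that no remaining response accounts for.

    `stored` maps (question_id, option) to the count in the data store;
    `expected` maps the same keys to counts tallied from actual responses.
    """
    return {
        key: count
        for key, count in stored.items()
        if count > 0 and expected.get(key, 0) == 0
    }


# All responses deleted, yet the summaries still carry counts (made-up values).
stored = {("q1", "yes"): 4, ("q1", "no"): 0, ("q2", "blue"): 2}
expected = {}
rogue = orphaned_summaries(stored, expected)
```

Running a check like this before and after a bulk upload would show whether the rogue counts come back with the upload or were never fully removed.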

I still haven't had time to test one of the scenarios mentioned in the test plan ("If you want, you can create extra fake data in the data.txt file, so as to stress the system extra") so have only been using a bulk upload of ~30 responses. I'll get the details of this method from @ichinaski and hopefully test that part also. Apologies it took so long - it's been busy as of late!

janagombitova commented 8 years ago

Closing down this issue https://github.com/akvo/akvo-flow/issues/1414