getodk / aggregate

ODK Aggregate is a Java server that stores, analyzes, and presents survey data collected using ODK Collect. Contribute and make the world a better place! ✨🗄✨
https://docs.opendatakit.org/aggregate-intro/

Infinite refresh caused by data store corruption #114

Open lognaturel opened 7 years ago

lognaturel commented 7 years ago

Aggregate can occasionally get into a state where it refreshes continuously. This is a known problem and a manual fix is documented at https://github.com/opendatakit/opendatakit/wiki/Aggregate-AppEngine-Troubleshooting#repairing-your-app-engine-database-yourself

This is reported relatively frequently; see, for example, https://forum.opendatakit.org/t/infinite-refresh-bug/10284, https://forum.opendatakit.org/t/upload-of-aggregate-server-page-jumps/9438, and https://forum.opendatakit.org/t/odk-aggregate-keeps-refreshing/6127. Presumably it often goes unreported.

When a blank form is duplicated, the problem appears immediately and the server keeps refreshing.

When a submission is duplicated, the problem shows up during CSV export, when Aggregate tries to assemble the corrupt submission.

In both cases, users may sometimes see the bad SQL query appear in bright red at the top of their screen.

At https://forum.opendatakit.org/t/odk-aggregate-wont-stop-refreshing/1264/25, Mitch says:

The unfortunately-common-enough refresh problem is specific to Google AppEngine and its 60-second request timeout, which aggressively terminates the server thread, potentially in the middle of a sequence of database writes (and often before the final set of deletes that transitions the server to the newest version of a form definition).
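To make the failure mode in that quote concrete, here is a minimal sketch of a "write the new rows, then delete the old ones" sequence that gets cut off before the deletes. Everything in it is an assumption for illustration: the in-memory `rows` list, the `form_id` and `version` fields, and the `RequestTimeout` exception are hypothetical stand-ins, not Aggregate's actual schema or App Engine's actual behavior.

```python
class RequestTimeout(Exception):
    """Hypothetical stand-in for the platform terminating the request thread."""


def update_form_definition(rows, form_id, new_version, fail_before_delete=False):
    """Write the row for the new version, then delete the older rows.

    `rows` is a hypothetical in-memory table (a list of dicts with
    "form_id" and "version" keys), not Aggregate's real data store.
    """
    # Step 1: write the row for the new version.
    rows.append({"form_id": form_id, "version": new_version})

    # Simulated termination between the writes and the final deletes.
    if fail_before_delete:
        raise RequestTimeout("request exceeded its time limit")

    # Step 2: delete every older row for this form.
    rows[:] = [
        r for r in rows
        if r["form_id"] != form_id or r["version"] == new_version
    ]


def simulate(fail_before_delete):
    """Run one update against a table that starts with a single row."""
    rows = [{"form_id": "survey", "version": 1}]
    try:
        update_form_definition(rows, "survey", 2, fail_before_delete)
    except RequestTimeout:
        pass  # the request is gone; nothing runs the pending deletes
    return rows
```

In this sketch, `simulate(False)` leaves one row for the form, while `simulate(True)` leaves both the old and the new row behind, which is the kind of duplicate state described earlier in this issue.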

https://github.com/opendatakit/aggregate/issues/79 proposes automating the db cleanup.
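As a rough idea of what an automated check could look like, the sketch below groups rows by a key and reports the keys that appear more than once, then proposes which rows to keep. It is not the approach from that issue or from the wiki's manual fix: the row layout, the `form_id` and `version` field names, and the "keep the highest version" rule are all assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicates(rows, key="form_id"):
    """Return {key value: [rows]} for every key value seen more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {k: g for k, g in groups.items() if len(g) > 1}


def plan_cleanup(rows, key="form_id", order="version"):
    """Split rows into (keep, delete) without modifying anything.

    Assumption: the row with the highest `order` value per key is the
    one to keep. A real cleanup would need a rule that matches the
    actual data store.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    keep, delete = [], []
    for group in groups.values():
        newest = max(group, key=lambda r: r[order])
        keep.append(newest)
        delete.extend(r for r in group if r is not newest)
    return keep, delete
```

Returning a plan instead of deleting directly leaves room for a person to review the proposed deletions before anything is removed.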

@yanokwa says:

Aggregate v1.4.10 and v1.4.11 both have patches that tried to wrap the problems with mutexes.
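This thread does not show how those patches are implemented, so the following is only a generic sketch of the idea of guarding an update with a mutex. The per-form lock registry, the function names, and the in-memory `rows` table are hypothetical and are not taken from Aggregate's code.

```python
import threading

# Hypothetical registry of one lock per form id.
_form_locks = {}
_form_locks_guard = threading.Lock()


def lock_for(form_id):
    """Return the lock for a form id, creating it on first use."""
    with _form_locks_guard:
        return _form_locks.setdefault(form_id, threading.Lock())


def replace_definition(rows, form_id, new_version):
    """Write the new row and delete the old ones while holding the form's lock."""
    with lock_for(form_id):
        rows.append({"form_id": form_id, "version": new_version})
        rows[:] = [
            r for r in rows
            if r["form_id"] != form_id or r["version"] == new_version
        ]


def run_concurrent_updates(versions):
    """Apply one update per version from separate threads and return the table."""
    rows = [{"form_id": "survey", "version": 0}]
    threads = [
        threading.Thread(target=replace_definition, args=(rows, "survey", v))
        for v in versions
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rows
```

A lock like this serializes concurrent updates to the same form, so each one finishes its writes and deletes before the next begins. It does not, on its own, help when a request is terminated partway through the guarded section.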

ggalmazor commented 7 years ago

@opendatakit-bot claim

getodk-bot commented 6 years ago

Hello @ggalmazor, you claimed this issue to work on it, but this issue and any referenced pull requests haven't been updated for 7 days. Are you still working on this issue?

If so, please update this issue by leaving a comment on this issue to let me know that you're still working on it. Otherwise, I'll automatically remove you from this issue in 3 days.

If you've decided to work on something else, simply comment @opendatakit-bot unclaim so that someone else can claim it and continue from where you left off.

Thank you for your valuable contributions to Open Data Kit!

lognaturel commented 6 years ago

Leave him alone, @opendatakit-bot, this is a big issue! 😅

ggalmazor commented 6 years ago

I think we can close this now that #154 has been merged...

yanokwa commented 6 years ago

I want to hold it open until we fix all the dupe _form issues because I know that causes refresh.

getodk-bot commented 6 years ago

Hello @ggalmazor, you have been unassigned from this issue because you have not updated this issue or any referenced pull requests for over 10 days.

You can reclaim this issue or claim any other issue by commenting @opendatakit-bot claim on that issue.

Thanks for your contributions, and hope to see you again soon!