NAVADMC / ADSM

A simulation of disease spread in livestock populations. Includes detection and containment simulation.

Caught a 500 error #972

Closed missyschoenbaum closed 4 years ago

missyschoenbaum commented 4 years ago

Our collaborators are getting this error (note that I xx'd out identifying info). I have asked for the log files and, if they can find one that failed, an example iteration.

Symptom - app was running and not completing all iterations. We did confirm that it is running on an Intel chip.

Output file text

Copying C:\Users\XXXXXX\Documents\ADSM Workspace\SCENARIOXXXX.sqlite3 to C:\Users\XXXXXXXXX\Documents\ADSM Workspace\settings\activeSession.sqlite3 . This could take several minutes... Sessions overwritten with C:\Users\XXXXXX\Documents\ADSM Workspace\SCENARIOXXXX.sqlite3 Checking Database states... Operations to perform: Synchronize unmigrated apps: floppyforms, webpack_loader, crispy_forms, productionserver, humanize, staticfiles, messages Apply all migrations: Results, admin, contenttypes, sessions, auth, ScenarioCreator, ADSMSettings Synchronizing apps without migrations: Creating tables... Running deferred SQL... Installing custom SQL... Running migrations: No migrations to apply. Operations to perform: Synchronize unmigrated apps: floppyforms, webpack_loader, crispy_forms, productionserver, humanize, staticfiles, messages Apply all migrations: Results, admin, contenttypes, sessions, auth, ScenarioCreator, ADSMSettings Synchronizing apps without migrations: Creating tables... Running deferred SQL... Installing custom SQL... Running migrations: No migrations to apply. Done migrating databases. C Engine Exit Code: 0 Starting Unit Stat creation Finished Unit Stat creation Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Caught a 500 error! Copying database to UK Trial d_37_6dc_Full Done Copying database to C:\Users\cjhanthorn\Documents\ADSM Workspace\UK Trial d_37_6dc_Full.sqlite3

missyschoenbaum commented 4 years ago

More files requested from collaborator, will post when I receive.

missyschoenbaum commented 4 years ago

output_maskedname.txt error.log server_error.log server_output.log

missyschoenbaum commented 4 years ago

Let me know if I need anything else.

BryanHurst commented 4 years ago

There is a little bit to go off of here. It tells us what stage crashes, but not what type of error it was since debug isn't on.

It would be helpful to have a scenario that recreates this, or have them run their scenario in a beta client. The beta client gives more detailed error outputs.

missyschoenbaum commented 4 years ago

They are sending files. Here's the preview: "database disk image is malformed".

missyschoenbaum commented 4 years ago

Here are the files, one with the full yellow screen text. DatabaseError at.docx server_error.log server_output.log

missyschoenbaum commented 4 years ago

Do you need an iteration? I looked at them and they just stop writing data. There is no error message captured.

missyschoenbaum commented 4 years ago

An additional note. She says, "As a separate update, this scenario ran completely on the original version on another machine we put it on, just to further muddy the waters for you." "Original version" here means not Beta.

BryanHurst commented 4 years ago

If it ran fine on another machine and has never run on this machine, I would say that the scenario file got corrupted during transfer.

If it has at some point worked on the current machine, then we need to see how we ended up corrupting the database locally.

Have they used a SQLite Viewer or any other program to inspect the database file outside of the ADSM program?

missyschoenbaum commented 4 years ago

No successful run on that machine. Seems to fail in the same iterations repeatedly. What should I have them look for in the database?

BryanHurst commented 4 years ago

I'd ask if they can try transferring the database from the original computer again.
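One way to tell whether the transfer itself damaged the file is to compare a checksum of the scenario file on both machines. The following is a minimal sketch, not part of ADSM; the function names `file_sha256` and `same_file` are hypothetical, and it assumes Python 3 is available on both machines.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large scenario files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """Return True when both files have identical contents (by SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Running `file_sha256` on the original machine and again on the failing machine, then comparing the two strings, would show whether the copy arrived byte-for-byte intact. A matching digest would point away from transfer corruption and toward something local, such as the disk.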

If that still has issues, I'll look into creating a utility in the ADSM program to investigate corrupt SQLite database files.
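As a rough idea of what such a check could look like, here is a sketch that uses SQLite's built-in `PRAGMA integrity_check` through Python's standard `sqlite3` module. It is an illustration only, not the ADSM utility; the function name `check_sqlite_file` is hypothetical, and opening the file read-only is an assumption made so the check cannot modify or create the scenario file.

```python
import sqlite3
from pathlib import Path


def check_sqlite_file(path):
    """Return a list of problems found in a SQLite file; an empty list means it passed."""
    # Build a file: URI so paths with spaces work, and open read-only (mode=ro).
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Raised when the file is missing, is not a database, or is too damaged to scan.
        return [str(exc)]
    messages = [row[0] for row in rows]
    # A healthy database yields exactly one row containing the text "ok".
    return [] if messages == ["ok"] else messages
```

Calling `check_sqlite_file` on the scenario file before it is copied into `activeSession.sqlite3` would give the user a readable list of problems instead of a stream of 500 errors.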

missyschoenbaum commented 4 years ago

They reloaded the database file. This time it stopped at iteration 9. The iterations just stop in the middle, no error is captured. server_error.log server_output.log iteration29.log

Don't know why iterations are numbered to 29 if it stopped at 9.

Could it be a fault on the disc of the PC?

missyschoenbaum commented 4 years ago

More notes, I asked about the 9 iterations vs 29 iterations. She said the progress bar said 9, but 31 files were output.

BryanHurst commented 4 years ago

I wouldn't worry about the mismatch of numbers.

The progress bar counts only iterations that have completed fully and successfully.
Iterations that crash still write incremental output, but they don't finish writing the file.

We might need to talk about this one in the meeting.

It does sound like we do need to put in handling of corrupt database files though.

missyschoenbaum commented 4 years ago

First test: going to delete a table with the app closed, from the database held in settings (the last one opened). Opened it with SQLiteStudio and deleted SC_airbornespread. Constraint failed; it would not allow the deletion. Yeah! Going to try SC_diseasespreadassignment. It was deleted, as it had no relationships.

missyschoenbaum commented 4 years ago

Note here: I am getting the rolling Pop map error, probably since I changed back into a directory. PopMapagain

missyschoenbaum commented 4 years ago

If it was the last database held in settings, the delete didn't matter. The app just overwrote the file with the copy in settings and went on working. Now I will try shifting it out of settings and see what happens. This time I will try SC_diseaseprogressionassignment, as it is easier to manipulate in the app. It did exactly what Bryan predicted: an error with no table found. delete table

missyschoenbaum commented 4 years ago

So, we can confirm that deleting a database object was not the problem.

BryanHurst commented 4 years ago

@BryanHurst we should manually corrupt a database to see if we can replicate the same error and check that it is catchable.
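A sketch of how that experiment could be scripted outside ADSM, using only Python's standard library: copy a database, overwrite every page after the first with junk bytes, then try a normal query and see whether the failure surfaces as a catchable `sqlite3.DatabaseError`. The helper names `corrupt_copy` and `try_read` are hypothetical, and this is only one of many ways a file can be damaged, so it may not match what happened to the collaborators' file.

```python
import shutil
import sqlite3


def corrupt_copy(source, target):
    """Copy source to target, then overwrite every page after the first with 0xFF bytes."""
    conn = sqlite3.connect(source)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    finally:
        conn.close()
    shutil.copyfile(source, target)
    with open(target, "r+b") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        size = handle.tell()
        if size <= page_size:
            raise ValueError("database has only one page; nothing to corrupt")
        # Page 1 (header and schema) stays intact, so the file still opens.
        handle.seek(page_size)
        handle.write(b"\xff" * (size - page_size))


def try_read(path, table):
    """Count rows in a table, reporting database errors as text instead of raising."""
    try:
        conn = sqlite3.connect(path)
        try:
            # The table name is interpolated for illustration; pass trusted names only.
            count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        return f"DatabaseError: {exc}"
    return f"ok: {count} rows"
```

Because the damage is applied to a copy, the original scenario file is never touched. If `try_read` returns a `DatabaseError` string for the damaged copy, the error is catchable at the Python level and could be reported to the user.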

BryanHurst commented 4 years ago

After intentionally corrupting my activeSession.db file, I do get the same errors as seen above.

image

When done in Production, it just throws the 500 error over and over instead of halting at the Django error screen.

This shows that their issue is indeed a problem with their scenario file (probably after transferring to the machine).

I'd stick with the current suggestion and enable the Django error pages in our next Production build so users get a better indication that something is wrong with their scenario file.

missyschoenbaum commented 4 years ago

Sounds good to me. Do we need to leave this issue open? I need to add to known bugs.

BryanHurst commented 4 years ago

#981 is for showing error messages in Production, so you can close this once it is logged in Known Bugs.

missyschoenbaum commented 4 years ago

Wiki done.