Open jacobdgm opened 1 year ago
Set your settings.py to DEBUG=True on production and restart the service. Trigger the problem, get the traceback, and then set it back.
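One way to make that toggle easy to flip and flip back is to read the flag from the environment instead of editing the file by hand. This is only a sketch: the `DJANGO_DEBUG` variable name and the `env_flag` helper are assumptions for illustration, not something the project is known to use.

```python
import os


def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# In settings.py: DEBUG stays off unless the (hypothetical) DJANGO_DEBUG
# variable is set to a truthy value for the current run.
DEBUG = env_flag("DJANGO_DEBUG")
```

With this in place, turning debugging on would mean setting the variable and restarting the service, and unsetting it afterwards restores the default.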
If the user's initials are P.H. this is the same problem I mentioned in #905 (he later reported things were working, and I think has since added some chants.) And I have been corresponding with another person working on this manuscript who didn't report anything. But I got an error just now too (on production but not staging) so...? It may or may not be relevant that this source is being actively worked on by several people at once, which I don't think is the case for any other sources at the moment.
@annamorphism: no, the person's initials are K.K. Do you know which source was causing the issues for #905? Might be helpful.
> Set your settings.py to DEBUG=True on production and restart the service. Trigger the problem, get the traceback, and then set it back.
This was a good suggestion! We just tried it, and still got our 502 page, and not a traceback.
Next step: I downloaded a copy of the production database. Hopefully I can plug it in locally and recreate the bug there.
OK. In that case it is all the same source, but three different users have now reported difficulties with it (in addition to you and me). Could it have to do with multiple people accessing this source at the same time, i.e. some sort of race condition? It would explain why it doesn't seem to happen on staging or locally.
Update: this is fixed on Production currently! The last chant added in the source was "Gloria patri" (cantus id = 909000), and we have a ton of gloria patris in the database. So it was taking way too long to go through all these chants to create a list of suggestions for adding subsequent chants.
The current fix running now on production is to simply bail out early and not provide suggestions if Gloria patri is the most recently added chant. We have a handful of ideas how to improve the performance of the suggestion feature, which we can try implementing over the next few days/weeks.
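For anyone following along, the early bail-out looks roughly like the sketch below. It is not the code running on production: `get_suggested_chants`, `find_suggestions` and `SKIP_SUGGESTIONS_FOR` are hypothetical names, and the expensive lookup is passed in as a plain callable so the example stays self-contained.

```python
GLORIA_PATRI_CANTUS_ID = "909000"

# Cantus IDs that match so many chants that building suggestions is too slow.
# (Hypothetical constant; only Gloria patri is known to be a problem.)
SKIP_SUGGESTIONS_FOR = frozenset({GLORIA_PATRI_CANTUS_ID})


def get_suggested_chants(latest_cantus_id, find_suggestions):
    """Return suggestions for the next chant, or [] if we bail out early.

    latest_cantus_id: Cantus ID of the most recently added chant, or None.
    find_suggestions: callable doing the expensive lookup (assumed).
    """
    if latest_cantus_id is None or latest_cantus_id in SKIP_SUGGESTIONS_FOR:
        # Skip the expensive lookup entirely rather than let the request hang.
        return []
    return find_suggestions(latest_cantus_id)
```

The trade-off is that users get no suggestions right after adding a Gloria patri, which seems acceptable compared to the page not loading at all.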
(big credit to @lucasmarchd01 and @jackyyzhang03 for helping debug this, and to @dchiller in the initial steps yesterday!)
Interesting! "Gloria Patri" does indeed lead to 502s on staging but not on production as of now. Wild! I don't think there are too many other chants that will have the same order of magnitude of results, but there could be a few other slightly problematic ones lurking.
> I don't think there are too many other chants that will have the same order of magnitude of results, but there could be a few other slightly problematic ones lurking.
To your last point, we checked this out as well. It turns out that none of the other chants have the same order of magnitude as "Gloria Patri". Here are the Cantus IDs with the highest counts:
[{'cantus_id': '909000', 'cantus_id_count': 9709}, {'cantus_id': '008097', 'cantus_id_count': 931}, {'cantus_id': '008081', 'cantus_id_count': 886}, {'cantus_id': '001328', 'cantus_id_count': 861}, {'cantus_id': '909030', 'cantus_id_count': 792} .......
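A count like this can be reproduced in plain Python as sketched below. This is an illustration of the output shape only, not the query that was actually run (in the application it would more likely be an aggregate done in the database); `top_cantus_ids` is a hypothetical helper and the input is assumed to be an iterable of Cantus ID strings, one per chant.

```python
from collections import Counter


def top_cantus_ids(cantus_ids, n=5):
    """Return the n most frequent Cantus IDs with their counts, highest first."""
    counts = Counter(cantus_ids)
    return [
        {"cantus_id": cantus_id, "cantus_id_count": count}
        for cantus_id, count in counts.most_common(n)
    ]
```

Each entry has the same `cantus_id` / `cantus_id_count` keys as the list above, so the two can be compared directly.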
very cool, @lucasmarchd01 ! Now I'm all set for CantusTriviaNight: "what is the SECOND most common chant in the Cantus Database?"
This might be quite easy sometime soon. The /json-nextchants and /json-nextfeasts CI endpoints are currently listed as "under construction" in the CI API readme; once they're complete, we can just fetch the data from there.
I don't know how likely it is that these endpoints will become available within the next month, say. With this in mind, I'm going to assign @lucasmarchd01 to follow up on this when the time comes.
We recently were forwarded an email by Debra:
For reference, this is the source: https://cantusdatabase.org/source/637892
@dchiller and I spent some time looking into this, and we're stumped. Adding chants to other sources works just fine. Adding chants to this source on staging and locally works fine - it's only broken on production.
We tried restarting the Docker containers on production, and that didn't fix it.
Looking at the nginx logs, this is the error message that is being logged for this request:
2023/08/01 19:51:47 [error] 21#21: *717092 upstream prematurely closed connection while reading response header from upstream, client: 132.216.191.196, server: , request: "GET /chant-create/637892 HTTP/2.0", upstream: "http://172.30.0.3:8000/chant-create/637892", host: "cantusdatabase.org", referrer: "https://cantusdatabase.org/source/637892"
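If more of these show up, the request and upstream can be pulled out of each log line with a small script. The sketch below assumes the nginx error log lines keep the format shown above; `parse_upstream_error` and the regex are illustrative, not an existing tool.

```python
import re

# Matches the request and upstream fields of an nginx error log line.
UPSTREAM_ERROR = re.compile(
    r'request: "(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)", '
    r'upstream: "(?P<upstream>[^"]+)"'
)


def parse_upstream_error(line):
    """Return method, path, protocol and upstream as a dict, or None."""
    match = UPSTREAM_ERROR.search(line)
    return match.groupdict() if match else None
```

Grouping the results by `path` would show whether the failures are limited to /chant-create/637892 or affect other sources too.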