numberscope / backscope

Numberscope's back end: responsible for getting sequences and other data from the On-Line Encyclopedia of Integer Sequences, pre-processing them (factoring, etc.), and storing them.
MIT License

Sequence names not being downloaded #121

Open gwhitney opened 6 months ago

gwhitney commented 6 months ago

After installing #120, I grabbed all of the test_get_oeis_values sequences from numberscope.colorado.edu/api to make sure positive, zero, and negative offsets were working. The values came back correct, but even half an hour later the sequence names had not been filled in. We chose these sequences because they have very little metadata, so the metadata fetch should surely have finished by then. This seems to be a bug, and a relatively new one, since most of the sequences in the database do have their names filled in.

There's nothing in the log, either, so apparently no failed requests.

I personally think this should definitely be investigated prior to the start of the Delft team's work; what do @Vectornaut and @katestange think?

katestange commented 6 months ago

Strongly agree!

katestange commented 6 months ago

Is it possible the database got messed up previously? Or are you sure these were new to the database at the time?

gwhitney commented 6 months ago

I am sure they were just added. You can try it by picking any sequence not on the following list, especially one with light metadata, then grabbing some values, then waiting 10 minutes, and grabbing a different number of values (to be sure your browser isn't using cached results). If it still says the sequence name has not been loaded, you are seeing the bug in action.

Known IDs: A000521, A001358, A001378, A000305, A100000, A000040, A006769, A000111, A123456, A000300, A095102, A001000, A005150, A007814, A099802, A070939, A000045, A000290, A005101, A000720, A000041, A000796, A000034, A057716, A000032, A000060, A000124, A000010, A000105, A000079, A000011, A002858, A000030, A074902, A000002, A189227, A290508, A005171, A219956, A045864, A005207, A248930, A000055, A400000, A078302, A321580, A153080, A000000
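
The manual check described above could be scripted roughly as follows. This is only a sketch under assumptions: `fetch` is a hypothetical stand-in for whatever HTTP call requests values from the API, the response is assumed to be a dict with a `name` field, and the placeholder text checked for is a guess rather than backscope's confirmed wording.

```python
import time
from typing import Callable

# Assumption: the real "name not loaded" wording may differ from this marker.
PLACEHOLDER_MARKER = "not yet loaded"


def name_missing(record: dict) -> bool:
    """Return True if the name is absent, empty, or still a placeholder."""
    name = record.get("name") or ""
    return name == "" or PLACEHOLDER_MARKER in name


def reproduce(
    fetch: Callable[[str, int], dict],
    oeis_id: str,
    first_count: int = 10,
    second_count: int = 11,
    wait_seconds: float = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the name is still missing after the wait."""
    # The first request is what should start the background metadata fetch.
    fetch(oeis_id, first_count)
    sleep(wait_seconds)
    # Ask for a different number of values so a cached response is not reused.
    record = fetch(oeis_id, second_count)
    return name_missing(record)
```

Passing `fetch` and `sleep` in as arguments keeps the sketch independent of any particular endpoint and lets it be exercised without a live server or a real ten-minute wait.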

katestange commented 6 months ago

Yes, then I agree this is high priority.

Vectornaut commented 6 months ago

I can check this out when I start on #69.

gwhitney commented 6 months ago

OK. If it's not clearly related to #69, let me know, because every other remaining item on the Backscope list is assigned to you: #69 and its satellites #33 and #104; and #118 because you just did the logging. So if you'd like to/need to split up the effort, the one that would seem easiest to shift to me is this one, unless it appears that it is tightly bound to #69.

Vectornaut commented 6 months ago

Whoops—this is happening because I broke fetch_metadata with some small mistakes in #111. The problem wasn't caught during review because the automated tests don't cover background work yet, and I didn't think to check manually that the sequence name and other metadata were showing up eventually.

My takeaway: when we do manual tests during pull request review, we should check that sequence names are showing up, as a quick way to verify that metadata is being fetched successfully.

gwhitney commented 6 months ago

That's fine for now. Ultimately, we need tests that leave the server running long enough for it to fetch all of the background data (for sequences with just a little data, maybe only a few seconds) and then verify that the data is there.
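
One common way to write such a test is to poll for the background result with a timeout instead of sleeping for a fixed period. The helper below is a generic sketch, not code from backscope; the name `wait_until` and its default timings are made up for illustration.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a test, `condition` might be a small function that re-requests the sequence and checks that its name has been filled in; the test then asserts that `wait_until(...)` returned True. Polling lets the test finish as soon as the metadata arrives while still failing cleanly if it never does.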