paleobot / pbot-dev

Codebase and initial design documents for pbot client
MIT License

Figure out why dev is running so slow #278

Closed by NoisyFlowers 1 month ago

NoisyFlowers commented 4 months ago

Dev is crawling lately. Might have to do with recent bulk uploads. The db is much bigger now.

Look into Neo4j indexing and anything else that seems like a likely cause.

NoisyFlowers commented 4 months ago

I noticed that GraphQL queries were being submitted twice. I found some chatter about a bug in Apollo Client, so I updated package.json to use v3.9.5 (previously v3.4.5).

This appears to have fixed the duplicate query problem.

NoisyFlowers commented 4 months ago

Since pbotIDs are the fundamental connective tissue between nodes, I created an index on pbotID for every node label.

CREATE INDEX node_range_role_pbotid IF NOT EXISTS FOR (n:Role) ON (n.pbotID);
CREATE INDEX node_range_preservationmode_pbotid IF NOT EXISTS FOR (n:PreservationMode) ON (n.pbotID);
CREATE INDEX node_range_feature_pbotid IF NOT EXISTS FOR (n:Feature) ON (n.pbotID);
CREATE INDEX node_range_organ_pbotid IF NOT EXISTS FOR (n:Organ) ON (n.pbotID);

CREATE INDEX node_range_person_pbotid IF NOT EXISTS FOR (n:Person) ON (n.pbotID);
CREATE INDEX node_range_group_pbotid IF NOT EXISTS FOR (n:Group) ON (n.pbotID);

CREATE INDEX node_range_reference_pbotid IF NOT EXISTS FOR (n:Reference) ON (n.pbotID);

CREATE INDEX node_range_otu_pbotid IF NOT EXISTS FOR (n:OTU) ON (n.pbotID);
CREATE INDEX node_range_synonym_pbotid IF NOT EXISTS FOR (n:Synonym) ON (n.pbotID);
CREATE INDEX node_range_comment_pbotid IF NOT EXISTS FOR (n:Comment) ON (n.pbotID);
CREATE INDEX node_range_collection_pbotid IF NOT EXISTS FOR (n:Collection) ON (n.pbotID);
CREATE INDEX node_range_description_pbotid IF NOT EXISTS FOR (n:Description) ON (n.pbotID);
CREATE INDEX node_range_characterinstance_pbotid IF NOT EXISTS FOR (n:CharacterInstance) ON (n.pbotID);
CREATE INDEX node_range_specimen_pbotid IF NOT EXISTS FOR (n:Specimen) ON (n.pbotID);
CREATE INDEX node_range_image_pbotid IF NOT EXISTS FOR (n:Image) ON (n.pbotID);

CREATE INDEX node_range_schema_pbotid IF NOT EXISTS FOR (n:Schema) ON (n.pbotID);
CREATE INDEX node_range_character_pbotid IF NOT EXISTS FOR (n:Character) ON (n.pbotID);
CREATE INDEX node_range_state_pbotid IF NOT EXISTS FOR (n:State) ON (n.pbotID);
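
All of the statements above follow one naming pattern, so they could be generated from the list of labels rather than maintained by hand. The following is only a sketch of that idea, not how the indexes were actually created: the helper names (index_statement, all_index_statements) are hypothetical, and the label list is copied from the statements above.

```python
# Labels copied from the CREATE INDEX statements in this comment.
LABELS = [
    "Role", "PreservationMode", "Feature", "Organ",
    "Person", "Group",
    "Reference",
    "OTU", "Synonym", "Comment", "Collection", "Description",
    "CharacterInstance", "Specimen", "Image",
    "Schema", "Character", "State",
]


def index_statement(label, prop="pbotID"):
    """Build one CREATE INDEX statement using the naming pattern above."""
    # Hypothetical helper: index name is node_range_<label>_<property>, lowercased.
    name = f"node_range_{label.lower()}_{prop.lower()}"
    return f"CREATE INDEX {name} IF NOT EXISTS FOR (n:{label}) ON (n.{prop});"


def all_index_statements(labels=LABELS):
    """Return one statement per label, in the order given."""
    return [index_statement(label) for label in labels]


if __name__ == "__main__":
    # Print the statements so they can be pasted into a Cypher shell.
    print("\n".join(all_index_statements()))
```

When a new node label is added to the schema, appending it to LABELS and rerunning the script is safe, because IF NOT EXISTS makes each statement a no-op for indexes that are already in place.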

I did not collect exhaustive A/B data, but before the indexes were added, Reference queries with <> tests on pbotID were taking ~400 ms. Now they take ~100 ms.
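
If more systematic before/after numbers are wanted later, a small timing helper along these lines might do. It is a sketch under assumptions: run_query is a hypothetical zero-argument callable that executes one Reference query against dev (for example through whatever client is already in use), and the repeat and warm-up counts are arbitrary.

```python
import statistics
import time


def median_latency_ms(run_query, repeats=20, warmup=3):
    """Return the median wall-clock time of run_query() in milliseconds."""
    # Warm-up calls are executed but not timed, so cold caches do not skew the result.
    for _ in range(warmup):
        run_query()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical callable supplied by the caller
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Running it once with the indexes dropped and once with them in place would give two medians to compare; the median is used rather than the mean so that a single slow outlier does not dominate.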

ecurrano commented 3 months ago

It is definitely better! The specimen page was the worst, and while the dropdown menu still takes seconds to load (unsurprising given how big the list is), the magnifying glass search is available in less than a second. I think we will probably want to eliminate the specimen dropdown regardless, as there are going to be so, so, so many specimens!