ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum
Apache License 2.0

search result should not time out due to too many records returned #857

Closed amgunderson closed 4 years ago

amgunderson commented 8 years ago

Arctos has millions of specimens, so it should be able to handle searches that yield tens of thousands of results without timing out. VertNet gave me the results I wanted when Arctos could not. Why can't I add labels to this issue?

dustymc commented 8 years ago

Agreed!

VertNet isn't a very good functional comparison (there's only one "table" and one type of query to it), but your point stands - Arctos should be able to process more data faster. Possible solutions include:

1) Oracle tuneup. We have fairly impressive hardware (for a "single box" anyway) and could certainly see significant (ish??) gains from a focused tuning effort. Access to professional consultation would be near-critical, and #104 (which needs to be reviewed for the current environment) and similar should be fully resolved first.

2) Throw hardware at it. VertNet is running on Google/Amazon infrastructure, which is likely prohibitively expensive for Arctos. TACC may be able to offer more, possibly via https://www.xsede.org/ecss - AC should investigate.

3) Limit the options. If we allow query only by things in FLAT and display as results only things in FLAT (which is mostly the data in VN), our performance should remain acceptable. We may be able to address this via the "simple search" discussed at the Sevillita meeting. (We'd probably also need to retain a "full search" with our current limitations??)

I believe @ccicero has limited who can assign labels pending resolution of #834 (e.g., in anticipation of periodic issue review by the AC/working group/whatever it becomes).

dustymc commented 4 years ago

Tentatively closing - PG should provide a path to the resources needed for this.