Closed colin-combe closed 4 months ago
Could we first try increasing the allowed execution time to see if it works when given more time?
That would be something to try on your side? (@sureshhewabi, @ypriverol)
Need to implement caching.
I have nothing against caching, but I don't think it will help with this problem, because the problem is with the first load?
From Arthur's original error report:
the initial request (either for sequences or residue pairs) times out after 30 s with HTML code 500 (internal server error). I assume it continues in the background and goes to the cache because, after some time (~5 min was enough for me), the same request completes in 1 or 2 seconds.
As I said, I assume the reason it works the second time is the caching that's enabled in PostgreSQL by default.
I think the solutions would be: speed up the database queries (might be possible); increase the allowed execution time (not great if it needs about 5 min); or some kind of paging. Maybe all these things would be good.
Could we first try increasing the allowed execution time to see if it works, and how much more time it requires?
Also, I'll try speeding up the queries.
@sureshhewabi - from https://docs.pylonsproject.org/projects/waitress/en/stable/arguments.html
`channel_timeout`: Maximum seconds to leave an inactive connection open (integer). "Inactive" is defined as "has received no data from a client and has sent no data to a client".
This might be the argument to fiddle with; if we could increase it to 300, maybe things would work.
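For reference, a sketch of what raising that limit could look like if the API is started from code with waitress (the module and app names here are placeholders, not the actual deployment):

```python
# Sketch only: raising waitress's channel_timeout from its default of 120 s.
# "myapi" / "app" are placeholder names, not the real deployment.
from waitress import serve  # third-party: pip install waitress
from myapi import app       # the WSGI application (placeholder import)

serve(app, listen="0.0.0.0:8080", channel_timeout=300)
```

The same adjustment should also be available as a `--channel-timeout` flag if the server is launched via the `waitress-serve` command line.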
I'm not sure what the problem is here now, because for me it executes and returns quite quickly. (I tried to fill up my local database to a realistic degree before testing.)
On a copy of PRIDE's current production database, the residue pair query for PXD019437 only takes about a second and a half for me (first execution, so no caching). So I guess it's not the DB query that's slowing things down.
@sureshhewabi - if you want to test it on the actual database the query is:
```sql
SELECT si.id, u.identification_file_name AS file, si.pass_threshold AS pass,
       pe1.dbsequence_ref AS prot1, dbs1.accession AS prot1_acc, (pe1.pep_start + mp1.link_site1 - 1) AS pos1,
       pe2.dbsequence_ref AS prot2, dbs2.accession AS prot2_acc, (pe2.pep_start + mp2.link_site1 - 1) AS pos2
FROM spectrumidentification si
INNER JOIN modifiedpeptide mp1 ON si.pep1_id = mp1.id AND si.upload_id = mp1.upload_id
INNER JOIN peptideevidence pe1 ON mp1.id = pe1.peptide_ref AND mp1.upload_id = pe1.upload_id
INNER JOIN dbsequence dbs1 ON pe1.dbsequence_ref = dbs1.id AND pe1.upload_id = dbs1.upload_id
INNER JOIN modifiedpeptide mp2 ON si.pep2_id = mp2.id AND si.upload_id = mp2.upload_id
INNER JOIN peptideevidence pe2 ON mp2.id = pe2.peptide_ref AND mp2.upload_id = pe2.upload_id
INNER JOIN dbsequence dbs2 ON pe2.dbsequence_ref = dbs2.id AND pe2.upload_id = dbs2.upload_id
INNER JOIN upload u ON u.id = si.upload_id
WHERE u.id = ANY (ARRAY[12, 13, 14, 15])
  AND mp1.link_site1 > 0 AND mp2.link_site1 > 0
  AND pe1.is_decoy = false AND pe2.is_decoy = false
  AND si.pass_threshold = true;
```
Maybe it's actually working; will test again when the API is back up.
No, it doesn't work; it's just like Arthur originally said. ~~I suspect waitress channel_timeout.~~
its a problem with FastAPI https://github.com/PRIDE-Archive/xi-mzidentml-converter/issues/45
Not necessarily, but it's also nothing to do with the DB or caching of DB results.
@colin-combe can you please confirm that the API is working?
Seems none of the initial endpoints:
https://www.ebi.ac.uk/pride/archive/xiview/ws/projects
https://www.ebi.ac.uk/pride/archive/xiview/ws/pdbdev/projects/{pid}/sequences
https://www.ebi.ac.uk/pride/archive/xiview/ws/pdbdev/projects/{pid}/residue-pairs/based-on-reported-psm-level/passing
etc. are available right now.
I've looked through the source of the website and found the
https://www.ebi.ac.uk/pride/ws/archive/crosslinking/projects
endpoint, but none of the lower-level endpoints (like sequences) are accessible through it.
Also,
/pride/ws/archive/crosslinking/pdbdev/projects/PXD019437/residue-pairs/based-on-reported-psm-level/all
from #45 isn't available either.
@aozalevsky
Please note that all the URLs were changed from /pride/archive/xiview/ws
to /pride/ws/archive/crosslinking
All the URLs can be found (and tested) on the swagger page: https://www.ebi.ac.uk/pride/ws/archive/crosslinking/docs
@sureshhewabi @colin-combe thanks for the clarification. I ran the tests (in that exact order):
It looks like the effect is still there, but:
1) Even the slowest cold request completed in ~15 s, which is far less than the timeout. Huge improvement.
2) "Hot" requests from the cache are very consistent
3) I assume the `residue-pairs` calls are internally almost identical, with filtering applied only at the final stage; thus calling `all` first effectively triggers caching for `passing`, so there is no hot/cold effect on the latter.
I think that's ok. But we still have to be aware, that if larger entries come up they can potentially time out.
The `pdbdev/{pid}/sequences` endpoint doesn't work and always returns empty output.
Example:
https://www.ebi.ac.uk/pride/ws/archive/crosslinking/pdbdev/projects/PXD019437/sequences
```json
{
  "data": []
}
```
I tried `crosslinking/projects/{pid}/proteins` instead, but it doesn't return sequences. Sequences are crucial for us. Would it be possible to fix this?
> I think that's ok. But we still have to be aware, that if larger entries come up they can potentially time out.
Yeah, it's a potential problem. Suresh says that pagination of the data returned by REST APIs is highly recommended; I guess that would avoid any potential timeouts.
> The pdbdev/{pid}/sequences endpoint doesn't work and always returns empty output.
Thanks for the bug report; yes, it was broken. @sureshhewabi - this PR should fix it:
https://github.com/PRIDE-Archive/xiview-api/pull/4/files
So retrieving sequences should be fixed once @sureshhewabi has updated things.
@sureshhewabi - I notice there's still string concatenation in the creation of the SQL queries for the PDB-Dev endpoint (my bad). I'll fix that.
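As general background on why the string concatenation is worth removing: bound parameters keep values out of the SQL text entirely. A minimal sketch using the stdlib `sqlite3` driver (the real API talks to PostgreSQL, where the placeholder style differs — `%s` for psycopg2, `:name` for SQLAlchemy's `text()`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upload (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO upload VALUES (?, ?)",
                 [(12, "a"), (13, "b"), (99, "c")])

# Unsafe: values concatenated into the SQL string.
# ids = "12, 13"
# conn.execute(f"SELECT name FROM upload WHERE id IN ({ids})")

# Safe: only placeholders go into the SQL text; the values
# travel separately as bound parameters.
ids = [12, 13]
placeholders = ", ".join("?" for _ in ids)
rows = sorted(conn.execute(
    f"SELECT name FROM upload WHERE id IN ({placeholders})", ids
).fetchall())
print(rows)  # [('a',), ('b',)]
```

The only string-building left is for the placeholder list itself, which never contains user data.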
I will implement pagination; otherwise we are serving too much data.
@sureshhewabi i see that https://github.com/PRIDE-Archive/xiview-api/pull/4/files was merged, are there any estimates of when it will be rolled out? Thanks in advance!
@aozalevsky Deployed. Please check now
I added the pagination for the /residue-pairs/based-on-reported-psm-level/passing endpoint as well. (eg: https://www.ebi.ac.uk/pride/ws/archive/crosslinking/pdbdev/projects/PXD019437/residue-pairs/based-on-reported-psm-level/passing)
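For programmatic clients, a paginated endpoint is usually drained by looping until a short or empty page comes back. A sketch with a stand-in for the HTTP call (the `page`/`page_size` parameter names are assumptions — check the swagger docs for the actual ones):

```python
def fetch_all_pages(fetch_page, page_size=100):
    """Collect results from a paginated endpoint until a short/empty page."""
    results = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        results.extend(batch)
        if len(batch) < page_size:  # last page reached
            return results
        page += 1

# Stand-in for the real HTTP call, e.g.
# requests.get(url, params={"page": page, "page_size": size}).json()["data"]
data = list(range(250))
def fake_fetch(page, size):
    start = (page - 1) * size
    return data[start:start + size]

print(len(fetch_all_pages(fake_fetch)))  # 250
```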
@sureshhewabi everything seems to be ok, thanks!
@aozalevsky Thank you for your support and comments! Could you please provide more details on this endpoint please:
1) Where are you going to use this endpoint? Will this get called by a webpage, or is it for programmatic access from a pipeline?
2) Could you reproduce the test that you did previously?
3) What is the tool you were using to do the API performance test, and what are the parameters? (For example, if you used JMeter: number of threads, ramp-up period, and loops (4, I guess).)
BTW, there was a bug in that query, which is why it was serving so much data; it has been fixed now.
@sureshhewabi
- Where are you going to use this endpoint? Will this get called by a webpage, or is it for programmatic access from a pipeline?
We're going to use it programmatically for the generation of validation reports for structures deposited at https://pdb-dev.wwpdb.org/ . The goal is to establish a correspondence between raw data deposited in mzIdentML files (sequences + residue pairs) and molecules and restraints used for modeling.
- Could you reproduce the test that you did previously?
I repeated the same tests (without pagination):
- `all`: multiple entries timed out at the first call (~30 s).
- `passing`: worked as expected (already cached after calling `all`).
- Sequences: here everything worked fine and was surprisingly consistent. I didn't check the source code, but I assume this data was also triggered/cached during the `residue-pairs` call.
- What is the tool you were using to do the API performance test, and what are the parameters? (For example, if you used JMeter: number of threads, ramp-up period, and loops (4, I guess).)
Oh, it's a very naive test, just a simple Python loop:

```python
import requests
import time
import tqdm

# `response` holds the JSON fetched earlier from the .../crosslinking/projects endpoint
responses = []
for r_ in tqdm.tqdm_notebook(response['projects']):
    t0 = time.time()
    pid = r_['project_id']
    url = f"https://www.ebi.ac.uk/pride/ws/archive/crosslinking/pdbdev/projects/{pid}/residue-pairs/based-on-reported-psm-level/all"
    r = requests.get(url)
    t1 = time.time()
    print(pid, t1 - t0)
    try:
        responses.append(r.json())
    except Exception as e:
        print(pid, e)
        continue
```
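Given the cold-start 500s described above, a loop like that usually wants an explicit retry so a single timeout doesn't lose the entry. A small generic helper (names are illustrative, not part of the API):

```python
import time

def get_with_retry(fetch, retries=2, delay=5.0):
    """Call fetch(); on failure wait `delay` seconds and retry.

    Returns the first successful result; re-raises the last
    exception once `retries` extra attempts are exhausted.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(delay)

# Demo with a function that fails once, then succeeds
# (stand-in for e.g. a requests.get(...).json() call):
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("500")
    return {"data": []}

print(get_with_retry(flaky, delay=0.01))  # {'data': []}
```

The first failed call doubles as the cache warm-up, so the retry tends to hit the fast path.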
@sureshhewabi - I think I just saw the same thing with https://www.ebi.ac.uk/pride/archive/crosslinking The data didn't load the first time and the table was left blank, but subsequent attempts did work.
@sureshhewabi looks like crosslinking API is down again. All endpoints return 500
and https://www.ebi.ac.uk/pride/archive/crosslinking shows a blank page.
@aozalevsky Sorry, that was my mistake today. I was doing a deployment and it got broken along the way. I am working on this, but it looks like it will take more time to fix.
I realised it is because of some instability in the network in OpenStack. I will contact systems and get back to you.
@sureshhewabi thanks! I see that the API is back and seems to be working properly, except for one entry: PXD036833.
It shows both empty sequences and empty residue pairs.
https://www.ebi.ac.uk/pride/ws/archive/crosslinking/pdbdev/projects/PXD036833/sequences
At the same time, https://www.ebi.ac.uk/pride/archive/crosslinking/PXD036833 seems to be ok.
@sureshhewabi @colin-combe can you please check PXD036833? Since it's the only entry that is linked to PDB-Dev structures right now, it's crucial for further development and demos.
I am working on it. I did a fix in the API and am currently deploying it (I tested it locally and it was fine). I will delete the entry and re-parse it in production, then let you know.
@aozalevsky Please check now
@sureshhewabi everything works as expected! thank you so much, I'll proceed with integration and testing on our side.
@colin-combe @aozalevsky can we close this?
@sureshhewabi yes
Feedback from PDB-Dev:
Yes, I presume it's timing out the first time but succeeding later because the database caches results by default.
I guess there are two possible solutions: speed up the database queries, or increase the allowed execution time. Maybe both of these things are a good idea.
(Using subqueries before joining tables may speed things up, we're not sure about this.)
Could we first try increasing allowed execution time to see if it works when given more time?