msbentley / psa_utils

A set of utilities to work with ESA's Planetary Science Archive (PSA)
GNU General Public License v3.0

cassis products error #1

Open michaelaye opened 3 years ago

michaelaye commented 3 years ago

Thx for the package! just in time! :)

Trying to get CaSSIS labels (all of them). So, following your nice example notebook:

image

Then I take the ID from the first column and use it with products, but get a proxy error:

image

msbentley commented 3 years ago

Hi @michaelaye yes, this is a current limitation on the PSA side due to the performance of PDAP with large bundles. I raised a ticket on this internally back in February (PSAPCR-2534 for my own reference), but I'm not sure when it will be fixed. But I'll flag that external users need this addressed!

michaelaye commented 3 years ago

Ah, it's the thing you mentioned in Slack already, ok. Maybe you could wrap the error message so that the cause becomes clear to the user?

msbentley commented 3 years ago

Sure, I would have to check what other conditions may cause the same error type. I suspect that this is just happening due to a database query timing out, and that as a quick fix we could increase the time-out, but this is on the development team side to investigate.
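For illustration, wrapping the error could look roughly like the sketch below. This is not psa_utils code: `ArchiveQueryError`, `describe_failure` and `fetch` are hypothetical names, it uses only the standard library, and the assumption that a 502/504 usually points to a large dataset is taken from this thread rather than confirmed.

```python
import urllib.error
import urllib.request


class ArchiveQueryError(RuntimeError):
    """Hypothetical exception that carries a user-friendly explanation."""


def describe_failure(status, dataset_id):
    """Turn an HTTP status code into a message the user can act on."""
    if status in (502, 504):
        # Assumption from this thread: proxy/gateway errors tend to occur
        # when the server-side query for a very large dataset times out.
        return (
            f"The archive returned HTTP {status} for '{dataset_id}'. "
            "This is a known server-side limitation for very large "
            "datasets; the request itself was probably valid."
        )
    return f"Request for '{dataset_id}' failed with HTTP status {status}."


def fetch(url, dataset_id, timeout=30.0, opener=urllib.request.urlopen):
    """Fetch a URL and re-raise HTTP errors with a clearer message."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Keep the original exception chained for debugging.
        raise ArchiveQueryError(describe_failure(err.code, dataset_id)) from err
```

Other conditions that produce the same status code would get the same message, so the wording stays hedged ("probably") until the root cause is confirmed.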

michaelaye commented 3 years ago

I'm digging a bit into how the HTTP request works, and am wondering if you are sure that this is a server config issue due to large bundles? Because I found this similar issue that states that this behaviour is a bug in Apache 2.4.6 (used currently by the PDAP server) that was fixed in 2.4.13? https://serverfault.com/questions/834193/502-error-reading-from-remote-server-when-upstream-service-errors

michaelaye commented 3 years ago

The currently configured HTTP time-out, btw, is unusually short at 2 seconds:

image

though I'm not sure how precisely this affects things, because it takes much longer than 2 seconds to get the 502 error?

msbentley commented 3 years ago

Hi @michaelaye, I don't think the HTTP time-out is the issue. I did some timing tests: 68 out of 7813 datasets (under 1%) give this error, and the largest datasets that I can retrieve have about 50k products (0 products = failed):

image

Looking at the datasets that fail, they all look like big ones: PDS4 bundles (e.g. EM16 CaSSIS, ACS, NOMAD, FREND) and large PDS3 datasets (e.g. ROSINA and COSIMA on Rosetta).

So I believe that it is still the dataset size that is the issue, but I will let the archive engineers look into this and provide a solution!
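A timing test like the one described above could be sketched as follows. This is a minimal sketch, not the script that was actually run: `count_products` is a hypothetical callable standing in for whatever query returns the number of products in a dataset, and any exception is recorded as 0 products to match the "0 products = failed" convention.

```python
import time


def time_datasets(dataset_ids, count_products):
    """Query each dataset once and record (id, n_products, seconds).

    `count_products` is a hypothetical callable: it takes a dataset ID
    and returns the number of products, or raises on failure.
    """
    results = []
    for dataset_id in dataset_ids:
        start = time.perf_counter()
        try:
            n_products = count_products(dataset_id)
        except Exception:
            # Convention from the thread: 0 products means the query failed.
            n_products = 0
        elapsed = time.perf_counter() - start
        results.append((dataset_id, n_products, elapsed))
    return results


def failed_datasets(results):
    """Return the IDs of all datasets whose query failed."""
    return [dataset_id for dataset_id, n, _ in results if n == 0]
```

Comparing the elapsed time of the failures against the product counts of the successes is one way to check whether the failures cluster at a particular duration or at a particular size.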