Closed NucleonGodX closed 4 months ago
I'll review this in a bit more detail but can you add unit tests to ensure that detector works in a query as well as the construct methods create the expected queries?
I've added tests, let me know if its fine.
Based on my research and the insights provided by Eric, I have implemented the wavelength functionality. I believe we should discuss the details of this implementation and further changes needed in this week's meeting.
Can you add a changelog entry.
time = a.Time("2022-03-01 00:00", "2022-03-02 00:00")
level = a.Level(1)
product = a.soar.Product("EUI-FSI174-IMAGE")
detector= a.Detector("FSI")
result = Fido.search(instrument & time & level & product & detector)
print(result)
For some reason using detector= "FSI" in query results in an API error
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 408 Client Error: for url: http://soar.esac.esa.int/soar-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+h1.instrument,%20h1.descriptor,%20h1.level,%20h1.begin_time,%20h1.end_time,%20h1.data_item_id,%20h1.filesize,%20h1.filename,%20h1.soop_name,%20h2.detector,%20h2.wavelength,%20h2.dimension_index+FROM+v_sc_data_item%20AS%20h1%20JOIN%20v_eui_sc_fits%20AS%20h2%20USING%20(data_item_oid)+WHERE+h1.instrument='EUI'+AND+h1.begin_time%3E='2022-03-01+00:00:00'+AND+h1.begin_time%3C='2022-03-02+00:00:00'+AND+h2.dimension_index='1'+AND+h1.level='L1'+AND+h1.descriptor='eui-fsi174-image'+AND+h2.Detector='FSI'
if the query is done without taking detector in it, it gives normal response
like for,
result = Fido.search(instrument & time & level & product)
Even with an invalid detector in the query we get the normal 0 results output.
And for other detectors also it works perfectly fine for example
time = a.Time("2022-03-01 00:00", "2022-05-02 00:00")
detector= a.Detector("HRI_EUV")
result = Fido.search(instrument & time & level & detector)
22706 Results from the SOARClient:
Instrument Data product Level ... Detector Wavelength
...
---------- ------------------- ----- ... -------- ----------
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
... ... ... ... ... ...
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
EUI eui-hrieuv174-image L1 ... HRI_EUV 174.0
So the website says:
<INFO name="QUERY_STATUS" value="ERROR"> Cannot parse query 'SELECT h1.instrument, h1.descriptor, h1.level, h1.begin_time, h1.end_time, h1.data_item_id, h1.filesize, h1.filename, h1.soop_name, h2.detector, h2.wavelength, h2.dimension_index FROM v_sc_data_item AS h1 JOIN v_eui_sc_fits AS h2 USING ' for job '1718213015729OPE': Encountered "<EOF>" at line 1, column 236. Was expecting: "(" ... </INFO>
Could we have formated something incorrectly?
Also we should really try to get this error messages out to the user in the future.
I think you opened the link generated upto the http://soar.esac.esa.int/soar-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+h1.instrument,%20h1.descriptor,%20h1.level,%20h1.begin_time,%20h1.end_time,%20h1.data_item_id,%20h1.filesize,%20h1.filename,%20h1.soop_name,%20h2.detector,%20h2.wavelength,%20h2.dimension_index+FROM+v_sc_data_item%20AS%20h1%20JOIN%20v_eui_sc_fits%20AS%20h2%20USING%20(data_item_oid)+WHERE+h1.instrument=
which to me it gives
<INFO name="QUERY_STATUS" value="ERROR"> Job timeout/aborted. </INFO>
<INFO name="HttpErrorCode" value="408"/>
<INFO name="Action" value="Handling request"/>
I think you opened the link generated upto the http://soar.esac.esa.int/soar-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+h1.instrument,%20h1.descriptor,%20h1.level,%20h1.begin_time,%20h1.end_time,%20h1.data_item_id,%20h1.filesize,%20h1.filename,%20h1.soop_name,%20h2.detector,%20h2.wavelength,%20h2.dimension_index+FROM+v_sc_data_item%20AS%20h1%20JOIN%20v_eui_sc_fits%20AS%20h2%20USING%20(data_item_oid)+WHERE+h1.instrument=
which to me it gives
<INFO name="QUERY_STATUS" value="ERROR"> Job timeout/aborted. </INFO> <INFO name="HttpErrorCode" value="408"/> <INFO name="Action" value="Handling request"/>
Ah I just copied the url incorrectly.
@ebuchlin or @hayesla, if we hit a timeout like this, who can we contact to ask about it?
We already had similar non-expected timeout issues for different (simpler) requests, and they have been solved on SOAR side. I tried to write the join condition explicitly (h1.data_item_oid=h2.data_item_oid
). If I limit the number of results explicitly (SELECT TOP 10
) I get something but it still takes too long (>30s). So I guess we should ask SOAR (I can do it).
Hi guys, you ask me 😊 Also I'd be really grateful if you could nudge me when there are things missing from the SOAR help pages - they need an overhaul 😢 I'm looking into this timeout, but instead of v_eui_sc_fits.detector (which only gives HRI_LYA, HRI_EUV or FSI), you could use v_sc_data_item.sensor (which contains at least fsi174, fsi304, hrieuv174, hrilya1216)? However, right now, even "SELECT DISTINCT sensor FROM v_sc_data_item WHERE instrument='EUI'" times out, so I'll get onto the development team. Thanks, H
For example:
SELECT h1.instrument, h1.descriptor, h1.level, h1.begin_time, h1.end_time,
h1.data_item_id, h1.filesize, h1.filename, h1.soop_name, h1.sensor,
h2.wavelength, h2.dimension_index
FROM v_sc_data_item AS h1
JOIN v_eui_sc_fits AS h2 USING (data_item_oid)
WHERE h1.instrument='EUI' AND h1.begin_time>='2022-03-01T00:00:00'
AND h1.begin_time<='2022-03-02T00:00:00' AND h2.dimension_index='1'
AND h1.level='L1' AND h1.sensor LIKE 'FSI%'
gave me the 180 results I was expecting.
Hi again, I should have mentioned that I'm the Solar Orbiter Archive Scientist. The v_sc_data_item.sensor is indexed, and so works better than v_eui_sc_fits. Also, there is a ticket to fix the lack of end time in the v_eui_sc_fits view. I can prioritise it now I know you need it.
Hi @herroyalmaj thanks for this info!
Any idea what the timeout issue is?
for SPICE, the wavemin/wavemax only has the first window (even if there is data in the file from other windows), hence it would be unclear and potentially misleading to a user if we return this information. Hence what we should do for SPICE is that this information is not returned from the table and a user cannot search over these.
Hi again, I should have mentioned that I'm the Solar Orbiter Archive Scientist. The v_sc_data_item.sensor is indexed, and so works better than v_eui_sc_fits. Also, there is a ticket to fix the lack of end time in the v_eui_sc_fits view. I can prioritise it now I know you need it.
Thanks for the advice to use this indexed column. Maybe indexed columns could be indicated in the Tables, Views, and Columns help page? I understand that timeouts can be expected when using WHERE conditions on non-indexed columns.
@NucleonGodX I think the last thing to do is to not return the PHI wavelengths now until we resolve the issue with the A vs nm. So I think we should just not return it in this PR. And similarly we should not return this for SPICE. Like we discussed on our call yesterday/
So for these both, we should maybe not even return the wavelength column. Let me know if this isnt clear.
Apart from these this PR is good to go!
Thanks for the reviews @ebuchlin and @hayesla.
I will run the CI and merge if it passes.
Thanks for the first stage of the project @NucleonGodX, you can now rebase the other two PRs to remove the commits that came from this PR.
This Pull Request is part of Google Summer of Code (GSoC) 2024, focused on enhancing metadata support for SOAR in sunpy-soar (as discussed in issue #46 ). This initial phase involves integrating attributes from the existing sunpy.net attribute system. Specifically, join methods have been developed, and the Detector attribute has been successfully implemented for remote-sensing instruments.