DavidT3 / DAXA

Democratising Archival X-ray Astronomy (DAXA) is an easy-to-use Python module for downloading multi-mission X-ray telescope data and processing it into usable archives. Users can acquire entire archives, or filter observations based on ID, position, or time. Supports XMM, with partial support for eROSITA, Chandra, NuSTAR, Swift, Suzaku, ASCA, ROSAT, and INTEGRAL.
BSD 3-Clause "New" or "Revised" License

More base filtering of observations required #358

Closed by DavidT3 2 weeks ago

DavidT3 commented 2 weeks ago

Currently I don't touch the 'ISSUE' flag found in the HEASARC numaster table (the master observation list).

However, when downloading the whole archive (well, all the cat files at least), the DAXA download checks produced the following errors:

daxa.exceptions.DAXADownloadError: [
FileNotFoundError('The archive data directory for 20513003001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20513004001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20601024002 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625001001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625003001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625004001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625002001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625016001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625017001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625024001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20625025001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626004001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626002001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626003001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626005001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626006001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626007001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626008001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20626025001 does not contain the following required directories; event_cl/'),
FileNotFoundError('The archive data directory for 20801033001 does not contain the following required directories; event_cl/')]

I think these are all observations with an issue flag of 1, so I should just filter them out automatically and show a warning when the master table is acquired.
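A rough sketch of what such a filter could look like, using only the standard library. The function name, the `issue_flag` key, and the toy rows are all assumptions made for illustration; they are not DAXA's actual table handling or real numaster entries.

```python
import warnings


def filter_flagged(rows, flag_key="issue_flag"):
    """Drop rows whose issue flag is non-zero and warn about how many were removed.

    `rows` is a list of dicts standing in for the master observation table;
    `flag_key` is a hypothetical column name.
    """
    kept = [row for row in rows if int(row[flag_key]) == 0]
    n_dropped = len(rows) - len(kept)
    if n_dropped > 0:
        warnings.warn(
            f"{n_dropped} of {len(rows)} observations have a non-zero issue "
            "flag and have been excluded.",
            stacklevel=2,
        )
    return kept


# Toy stand-in for the master table; ObsIDs and flags are placeholders.
master = [
    {"ObsID": "obs_a", "issue_flag": 1},
    {"ObsID": "obs_b", "issue_flag": 0},
    {"ObsID": "obs_c", "issue_flag": 0},
]
usable = filter_flagged(master)
```

Reporting the dropped count in the warning makes the cost of the filter visible as soon as the table is loaded.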

DavidT3 commented 2 weeks ago

Actually, I just noticed how many observations this filter excludes: 1755 of the 5567 (about 32%).

Absolutely can't do that! So we'll be reversing that decision, and the parsing of obs cat files in issue #350 will be doing some more heavy lifting.

DavidT3 commented 2 weeks ago

The parsing of observation cat files doesn't solve the initial download problems, though, if the user attempts to download the whole archive.
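One possible way around this, sketched below, is for the download check to record which ObsIDs lack the required directories rather than raising one error for the whole archive. This is only an illustration: the function name, the one-directory-per-ObsID layout under `root`, and the `required` argument are assumptions, not part of DAXA.

```python
from pathlib import Path


def split_by_required_dirs(root, obs_ids, required=("event_cl",)):
    """Sort ObsIDs into those with all required directories and those without.

    Returns (complete, incomplete), where `incomplete` maps each ObsID to the
    list of required directory names missing from its data directory.
    """
    complete = []
    incomplete = {}
    for obs_id in obs_ids:
        obs_dir = Path(root) / obs_id
        # A directory counts as missing if it is absent or is not a directory.
        missing = [name for name in required if not (obs_dir / name).is_dir()]
        if missing:
            incomplete[obs_id] = missing
        else:
            complete.append(obs_id)
    return complete, incomplete
```

The caller could then warn about the `incomplete` entries and carry on with the `complete` ones, so a handful of observations without `event_cl/` would not block a whole-archive download.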