astrodbtoolkit / astrodb-template-db

A template for astronomical databases.
https://astrodb-template-db.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Encourage best practices as described in Chen et al. 2022 #5

Open kelle opened 1 year ago

kelle commented 1 year ago

This journal article has very specific guidance for reporting data, including source names, photometry measurements, etc. We should add documentation, comments, tests, etc. which encourage/empower users to follow this guidance (e.g., tests which make sure source names are SIMBAD-resolvable).

https://iopscience.iop.org/article/10.3847/1538-4365/ac6268

arjunsavel commented 9 months ago

Update: added a throwaway line in the README (12cfed5be0fbb89638c7a6a2474752a68218af46)

arjunsavel commented 5 months ago

Question: are schema validation functions called every time a database is loaded or data is added? If so, should network-intensive or computationally intensive checks (like testing SIMBAD-resolvability) be data integrity tests instead of schema validation functions? @dr-rodriguez and @kelle, what do you think?

kelle commented 5 months ago

Data integrity tests. They should also be part of the ingestion scripts.

dr-rodriguez commented 4 months ago

To clarify, network-intensive checks should be part of data integrity checks since we want to be careful about doing too many remote calls while building the database.
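A minimal sketch of what such a data integrity check could look like, kept separate from the schema validators. Everything named here (`find_unresolvable`, `fake_resolver`, the sample names) is hypothetical and not part of the template. The resolver is passed in as a function so the sketch runs offline; a real one might wrap astroquery's `Simbad.query_object`, which is an assumption about how the check would be wired up.

```python
from typing import Callable, Dict, Iterable, List


def find_unresolvable(
    names: Iterable[str], resolve: Callable[[str], bool]
) -> List[str]:
    """Return names the resolver rejects, calling it once per unique name."""
    cache: Dict[str, bool] = {}
    failures: List[str] = []
    for name in names:
        if name not in cache:
            # Only unique names reach the (potentially remote) resolver.
            cache[name] = resolve(name)
        if not cache[name] and name not in failures:
            failures.append(name)
    return failures


# Stand-in for a remote lookup (hypothetical); it records each call so the
# number of "remote" requests can be inspected.
KNOWN_NAMES = {"Gl 229B", "TRAPPIST-1"}
calls: List[str] = []


def fake_resolver(name: str) -> bool:
    calls.append(name)
    return name in KNOWN_NAMES


source_names = ["Gl 229B", "TRAPPIST-1", "not a real source", "Gl 229B"]
unresolvable = find_unresolvable(source_names, fake_resolver)
print(unresolvable)
```

Caching by name keeps the number of remote calls at one per unique source, which matters if the check runs over a whole database or inside an ingestion script.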

As for the validation functions: they are called every time a database value is added if you are doing so with SQLAlchemy's ORM (which is what we're moving to). I'm not sure whether they are also run when the database is loaded; I would have to investigate or test this. I do think we should expand our validation functions as much as we can, since it will save us work down the line. If something fails to insert via the ORM, we've caught the problem before needing to write post-ingest tests (which in many cases may lead users to stop running a test instead of fixing the data).
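A small experiment along these lines could help settle the load-time question. This is a sketch under assumptions: the `Source` model, its `validate_source` check, and the in-memory SQLite database are hypothetical and not taken from the template schema. It uses SQLAlchemy's `@validates` decorator, rejects a bad value on assignment, and then loads a row that was written with raw SQL, bypassing the ORM.

```python
from sqlalchemy import Column, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base, validates

Base = declarative_base()


class Source(Base):
    # Hypothetical table, used only for this experiment.
    __tablename__ = "Sources"
    source = Column(String(100), primary_key=True)

    @validates("source")
    def validate_source(self, key, value):
        # Cheap, local check only; no network calls belong here.
        if value is None or not value.strip():
            raise ValueError("source name must be a non-empty string")
        return value.strip()


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

# The validator runs on attribute assignment, including in the constructor.
try:
    Source(source="   ")
    rejected = False
except ValueError:
    rejected = True

with Session(engine) as session:
    session.add(Source(source=" Gl 229B "))
    # Bypass the ORM to store a value the validator would reject.
    session.execute(text("INSERT INTO Sources (source) VALUES ('  ')"))
    session.commit()
    # Loading rows populates attributes without raising here.
    names = sorted(row.source for row in session.query(Source).all())

print(rejected, names)
```

In this sketch the blank name written with raw SQL loads without an error, which suggests the validators guard ORM inserts and updates but not rows that are already in the database. That would be another reason to keep post-ingest integrity tests alongside the validators.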