IGS / gEAR

The gEAR Portal was created as a data archive and viewer for gene expression data including microarrays, bulk RNA-Seq, single-cell RNA-Seq and more.
https://umgear.org
GNU Affero General Public License v3.0

Purge schema of unused columns/tables #32

Open jorvis opened 2 years ago

jorvis commented 2 years ago

This obviously isn't critical, but it's bothering me so I wanted to make a note of it. There are many columns and even tables which were part of the initial schema and design but which have fallen out of use and are no longer needed. It would be good to rid the schema of these, along with any API calls expecting them, and fully test the system afterwards to make sure none were missed.

Candidates to remove include:

anatomy.*

dataset.load_status

dataset.has_h5ad

dataset.plot_default

dataset_epiviz.* (@jkanche needs to confirm here)

layout_members.math_preference

layout_members.plot_preference

supplemental_images.*

guser.help_id (and all related API code)

API has references to dataset.primary_key in some places
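One way to find the API code that still expects these names before dropping anything is a whole-word search over the source tree. The following is only a rough sketch: the function name `find_references`, the file suffixes, and the idea of searching by bare column or table name are assumptions for illustration, not part of gEAR. Hits still need manual review, since a bare name can match an unrelated column (for example, the dataset-level math preference).

```python
import re
from pathlib import Path

# Names taken from the candidate list above. Table-wide entries such as
# anatomy.* are searched by table name. Searching by bare name is an
# assumption; some hits will refer to other tables and need manual review.
CANDIDATES = [
    "anatomy", "load_status", "has_h5ad", "plot_default", "dataset_epiviz",
    "math_preference", "plot_preference", "supplemental_images", "help_id",
    "primary_key",
]


def find_references(root, names=CANDIDATES, suffixes=(".py", ".cgi", ".js", ".sql")):
    """Return {name: [(path, line_number, line_text), ...]} for whole-word matches.

    The suffix list is a guess at which file types matter; adjust as needed.
    """
    patterns = {n: re.compile(r"\b" + re.escape(n) + r"\b") for n in names}
    hits = {n: [] for n in names}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in patterns.items():
                if pattern.search(line):
                    hits[name].append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_references(".")` from a checkout would give, per name, the files and lines to inspect. An empty list for a name suggests (but does not prove) that nothing in the scanned file types still refers to it.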

@adkinsrs Let me know if you think any of these should be kept.

adkinsrs commented 2 years ago

I know there is the math preference field for a dataset (log2 vs log10) but I assume layout.math_preference was originally going to adjust/normalize the log scale for all datasets in a layout, correct?

Is it possible that the "anatomy" table would have some application for NeMO Analytics?

Regarding the 3 "dataset" fields, I feel that load_status and plot_default are OK to remove, but I have encountered instances (such as on my Docker instance, and very rarely in the devel/production server) where the dataset was missing the h5ad file, and I would expect queries using has_h5ad=1 to fail there.
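To see how often the flag and the file actually disagree, something like the check below could be run against each instance. This is a sketch under stated assumptions: the function name `find_h5ad_mismatches` is hypothetical, the flags are passed in as a plain mapping (however they are read from the dataset table), and the `<dataset_id>.h5ad` file naming is assumed rather than taken from the gEAR code.

```python
from pathlib import Path


def find_h5ad_mismatches(flags, h5ad_dir):
    """Compare has_h5ad flags against the files actually on disk.

    flags: mapping of dataset id -> has_h5ad value (0/1) from the dataset table.
    h5ad_dir: directory assumed (hypothetically) to hold '<dataset_id>.h5ad' files.
    Returns (flagged_but_missing, present_but_unflagged) as sorted lists of ids.
    """
    h5ad_dir = Path(h5ad_dir)
    flagged_but_missing = []
    present_but_unflagged = []
    for dataset_id, has_h5ad in flags.items():
        exists = (h5ad_dir / f"{dataset_id}.h5ad").is_file()
        if has_h5ad and not exists:
            # The flag says the file is there, but it is not on disk.
            flagged_but_missing.append(dataset_id)
        elif exists and not has_h5ad:
            # The file is on disk, but the flag says it is absent.
            present_but_unflagged.append(dataset_id)
    return sorted(flagged_but_missing), sorted(present_but_unflagged)
```

If both lists come back empty everywhere, the column carries no information that the filesystem does not already have, which would support removing it. A non-empty first list would point at the cases described above.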