IGS / gEAR

The gEAR Portal was created as a data archive and viewer for gene expression data including microarrays, bulk RNA-Seq, single-cell RNA-Seq and more.
https://umgear.org
GNU Affero General Public License v3.0

Multigene curations ready to go live #197

Open beamilon opened 2 years ago

beamilon commented 2 years ago

I have curated multi-gene displays for all datasets in the following profiles (very few have issues and I am keeping track of them and will create tickets later):

jorvis commented 2 years ago

OK, and what user were you logged in as when you made these curations?

beamilon commented 2 years ago

Curator

beamilon commented 2 years ago

@jorvis: some datasets seem to have escaped the transfer of owner for multigene display. If it is just because you are still implementing them, SORRY.

beamilon commented 2 years ago

@jorvis, I added multigene display to these additional datasets.

jorvis commented 2 years ago

I haven't done anything here yet. Are these datasets all within the same profiles as mentioned initially?

beamilon commented 2 years ago

The first list, yes. The second list contains datasets in NIHL, cell type-specific transcriptomics, mouse cochlea (Hertzano 2021) and in Noise/Damage/Protection.

beamilon commented 2 years ago

More curations done today:

jorvis commented 2 years ago

OK. I'll tackle these tonight.

beamilon commented 2 years ago

Here are more. I will try to finish tomorrow.

beamilon commented 2 years ago

Last datasets for now:

jorvis commented 2 years ago

This is ready and has run on the Adult profile as a test. When I added your expanded list of datasets, though, there was at least one instance where more than one dataset had the exact same title. Example:

mysql> select id, date_added, marked_for_removal from dataset where title = 'Spiral Ganglion Neurons (SGN) Response to PTS-inducing noise (scRNA-seq), violin plot display';
+--------------------------------------+---------------------+--------------------+
| id                                   | date_added          | marked_for_removal |
+--------------------------------------+---------------------+--------------------+
| 11f16f75-bfdd-0397-c9c1-e4cf35db5287 | 2021-01-31 22:58:37 |                  0 |
| 482cb81f-4816-e4e3-a3fe-514707b847d8 | 2021-01-28 03:54:28 |                  0 |
+--------------------------------------+---------------------+--------------------+

We can talk and resolve this tomorrow.

beamilon commented 2 years ago

Some datasets have several windows with different displays in the single gene search. The dataset is the same but the title is slightly different, like the two below. Is this the problem? I took advantage of the multiple windows to create different multigene displays.

Spiral Ganglion Neurons (SGN) Response to PTS-inducing noise (scRNA-seq), UMAP display
Spiral Ganglion Neurons (SGN) Response to PTS-inducing noise (scRNA-seq), violin plot display

jorvis commented 2 years ago

No, it's totally fine to have the display variations in the title as you've shown just there. The ones which are causing an error are those where the titles are absolutely identical. Here are those datasets owned by curator and how many times that title is found. There is even one which is loaded 9 times.


mysql> select title, count(title) from dataset where marked_for_removal = 0 and owner_id = 499 group by title having count(title) > 1;
+-----------------------------------------------------------------------------------------------+--------------+
| title                                                                                         | count(title) |
+-----------------------------------------------------------------------------------------------+--------------+
| scRNAseq, Atoh1+Ikzf2 overexpression in adult supporting cells, published UMAP (Liu, 2021)    |            2 |
| P2, mouse, scATAC-seq, organ of Corti apex and base (Waldhaus)                                |            2 |
| P1, mouse, scRNA-seq, cochlear epithelium,subset50,demo1 (Kelley)                             |            9 |
| Spiral Ganglion Neurons (SGN) Response to PTS-inducing noise (scRNA-seq), violin plot display |            2 |
| Elena upload test                                                                             |            2 |
| CD45+ Cochlear Immune Cells Response to PTS-inducing noise, scRNA-seq, violin plot display    |            2 |
| Lateral Wall Response to PTS-inducing noise (scRNA-seq), violin plot display                  |            2 |
| Lateral Wall Immune Cells Response to PTS-inducing noise, scRNA-seq, violin plot display      |            2 |
+-----------------------------------------------------------------------------------------------+--------------+
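
The same duplicate-title check can be sketched as a small script. This is only an illustration, not gEAR code: it uses an in-memory SQLite database as a stand-in for the MySQL instance, a hypothetical `find_duplicate_titles` helper, and made-up rows. The column names (`title`, `owner_id`, `marked_for_removal`) are taken from the query above; the rest of the schema is assumed.

```python
import sqlite3


def find_duplicate_titles(conn, owner_id):
    """Return (title, count) for active datasets of one owner that share an identical title."""
    sql = """
        SELECT title, COUNT(*) AS n
          FROM dataset
         WHERE marked_for_removal = 0
           AND owner_id = ?
         GROUP BY title
        HAVING COUNT(*) > 1
         ORDER BY title
    """
    return conn.execute(sql, (owner_id,)).fetchall()


def build_demo_db():
    """Create a throwaway table with invented rows (assumed schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dataset ("
        "id TEXT PRIMARY KEY, title TEXT, owner_id INTEGER, marked_for_removal INTEGER)"
    )
    rows = [
        ("a1", "SGN noise response, violin plot display", 499, 0),
        ("a2", "SGN noise response, violin plot display", 499, 0),  # exact duplicate title
        ("a3", "SGN noise response, UMAP display", 499, 0),         # display variation, not a duplicate
        ("a4", "Upload test", 499, 0),
        ("a5", "Upload test", 499, 1),                              # marked for removal, ignored
        ("a6", "Upload test", 12, 0),                               # different owner, ignored
    ]
    conn.executemany("INSERT INTO dataset VALUES (?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for title, count in find_duplicate_titles(build_demo_db(), 499):
        print(f"{count}x  {title}")
```

Titles that differ only in the display suffix (UMAP vs. violin plot) are not grouped together, which matches the point above that those variations are fine.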

beamilon commented 2 years ago

I see. I will check with Ronna because I am sure that some of the duplicates can be removed, but first we need to make sure that datasets sharing the same title also contain exactly the same data.
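
One possible way to check that two same-titled datasets hold identical data is to compare checksums of their underlying files. The sketch below is an assumption-heavy illustration: the helper names (`file_sha256`, `group_identical`) and the idea that each dataset id maps to a single file on disk are hypothetical, not something confirmed in this ticket. Note that a matching checksum proves the files are byte-identical; files with the same expression values but different formatting would still differ.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths_by_id):
    """Map each checksum to the sorted dataset ids whose file has that checksum.

    paths_by_id: dict of dataset id -> file path (hypothetical layout).
    """
    groups = defaultdict(list)
    for dataset_id, path in paths_by_id.items():
        groups[file_sha256(Path(path))].append(dataset_id)
    return {checksum: sorted(ids) for checksum, ids in groups.items()}
```

Any group containing more than one id is a candidate for removal of the extra copies; ids that land in different groups share a title but not their data, and would need a closer look before anything is deleted.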

jorvis commented 2 years ago

All of the datasets within the profiles initially posted in this ticket have been updated.

Can do the rest when we work out the duplication issue.

beamilon commented 2 years ago

I started looking at the duplicates.

beamilon commented 2 years ago

All the needed datasets have multiple displays now. So I think this ticket can be closed.

beamilon commented 2 years ago

I added a display for "P0, mouse, RNA-seq, hair cells vs epithelial non-hair cells (Hertzano)", so the ticket is not ready to be closed, I guess.

beamilon commented 2 years ago

I think that these 2 datasets were not transferred as default. They are in the Curator account but not in my account:

4dpf, zebrafish, microarray, hair cell, mantle cell and skin (Hudspeth)
5dpf, zebrafish, RNA-seq, hair cell (IP) and whole larvae input (IN) (Hertzano)

Now that the following dataset has been fixed, I created a multigene display for it:

4dpf, zebrafish, RNA-seq, TU-tagged hair cells mRNA and whole larvae input (Nicolson)