IGS / gEAR

The gEAR Portal was created as a data archive and viewer for gene expression data including microarrays, bulk RNA-Seq, single-cell RNA-Seq and more.
https://umgear.org
GNU Affero General Public License v3.0

Private datasets accessible in public profiles #259

Closed: adkinsrs closed this issue 2 years ago

adkinsrs commented 2 years ago

I was attempting to work on an issue by @beamilon and came across a private dataset in a public profile.

Profile: scRNA-seq - Chicken basilar papilla (Heller 2021)
Dataset: Chicken Basilar Papilla Baseline, P7, svg,scRNA-seq (Janesick), geneID (others in the profile are the same)

In this case, the datasets in this profile are owned by "curator", and when I am logged in as "Shaun Adkins" I can view the single-gene and multigene displays. But when I click the "multigene viewer" button to go to the curator, the page will not load the dataset (nor is it searchable), because the dataset is neither public nor shared with my user.

So the first question is whether this is intended behavior. If not, there is a potential worry that this is a backdoor way of exposing private datasets to a public environment. @beamilon has told me she will check whether there are other private-dataset/public-profile situations too.

beamilon commented 2 years ago

Here is the list of private datasets found in public profiles:

Central auditory system

Hair cell and supporting cell transcriptomics and epigenomics, Mouse (Segil 2021)

Lateral wall

NIHL, cell type-specific transcriptomics, mouse cochlea (Hertzano 2021)

Noise/Damage/Protection

Regenerating Neuromast (Piotrowski)

scRNA-seq - Chicken dying hair cells

scRNA-seq - P2 cochlea (Heller 2021)
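A list like the one above could be produced with a read-only audit query. The following is only a minimal sketch using Python's built-in sqlite3 module and a toy schema. The names `dataset`, `layout`, `layout_members`, `is_public`, and `is_domain` come from the queries later in this thread; the `label` and `title` columns, the sample rows, and the reading of `is_domain = 1` as "public profile" are assumptions made for illustration.

```python
import sqlite3

def private_datasets_in_public_profiles(conn):
    """Return (profile label, dataset title) pairs for datasets that are
    not flagged public but belong to a layout with is_domain = 1."""
    sql = """
        SELECT l.label, d.title
          FROM dataset d
               JOIN layout_members lm ON lm.dataset_id = d.id
               JOIN layout l ON lm.layout_id = l.id
         WHERE l.is_domain = 1
           AND (d.is_public = 0 OR d.is_public IS NULL)
         ORDER BY l.label, d.title
    """
    return conn.execute(sql).fetchall()

# Toy data (hypothetical): one public profile holding three datasets,
# and one personal profile holding a private dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, title TEXT, is_public INTEGER);
    CREATE TABLE layout (id INTEGER PRIMARY KEY, label TEXT, is_domain INTEGER);
    CREATE TABLE layout_members (layout_id INTEGER, dataset_id INTEGER);
    INSERT INTO dataset VALUES (1, 'open set', 1), (2, 'hidden set', 0), (3, 'unset flag', NULL);
    INSERT INTO layout VALUES (10, 'Public profile', 1), (20, 'Personal profile', 0);
    INSERT INTO layout_members VALUES (10, 1), (10, 2), (10, 3), (20, 2);
""")

audit_rows = private_datasets_in_public_profiles(conn)
print(audit_rows)
```

Treating a NULL `is_public` as private is a deliberate choice here, so that datasets with an unset flag are reported rather than silently skipped.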

jorvis commented 2 years ago

The behavior of showing all datasets in public profiles is as intended. I believe what has happened here is that datasets which were initially private were released simply by putting them in a public profile, without also taking the step of making each individual dataset public.

I'll get Ronna to verify that these are all meant to be open, and also add a warning to the dataset manager on any attempts to add a private dataset to a public profile.
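The warning mentioned here could be as simple as a check made before a dataset is added to a profile. This is a hypothetical sketch, not the dataset manager's actual code: the function name, its parameters, and the message text are all invented for illustration.

```python
def private_in_public_warning(dataset_is_public, profile_is_public):
    """Return a warning message if a non-public dataset is about to be
    added to a public profile, otherwise None.

    Hypothetical helper: both flags are plain truthy/falsy values, and an
    unset (None) dataset flag is treated as private.
    """
    if profile_is_public and not dataset_is_public:
        return ("This dataset is private but the profile is public. "
                "Its displays will be visible to anyone viewing the profile.")
    return None

# Example: a private dataset going into a public profile triggers the warning.
message = private_in_public_warning(dataset_is_public=0, profile_is_public=1)
if message:
    print(message)
```

Returning a message rather than raising an error matches the stated intent of warning the user rather than blocking the action.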

jorvis commented 2 years ago

I have changed all of these with the following queries:

UPDATE dataset SET is_public = 0 WHERE is_public IS NULL;
UPDATE dataset SET is_public = 1 WHERE id IN (
  SELECT did FROM (
    SELECT d.id AS did
      FROM dataset d
           JOIN layout_members lm ON lm.dataset_id = d.id
           JOIN layout l ON lm.layout_id = l.id
     WHERE l.is_domain = 1
       AND d.is_public = 0
  ) AS dats
);

Why the silly nested SELECTs? For reasons.
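One likely reason for the nesting, assuming the database is MySQL (the thread does not say): MySQL rejects an UPDATE whose subquery selects directly from the table being updated (error 1093), and wrapping the subquery in a derived table is the common workaround. The sketch below runs the same two statements against a toy schema with Python's built-in sqlite3 module to show their effect. The sample rows are invented, and SQLite does not need the extra wrapper, so this illustrates only the outcome, not the MySQL restriction.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, is_public INTEGER);
    CREATE TABLE layout (id INTEGER PRIMARY KEY, is_domain INTEGER);
    CREATE TABLE layout_members (layout_id INTEGER, dataset_id INTEGER);
    -- Hypothetical rows: datasets 1-3 sit in a domain layout, dataset 4 does not.
    INSERT INTO dataset VALUES (1, 1), (2, 0), (3, NULL), (4, 0);
    INSERT INTO layout VALUES (10, 1), (20, 0);
    INSERT INTO layout_members VALUES (10, 1), (10, 2), (10, 3), (20, 4);
""")

# Step 1: normalize unset flags to "private".
db.execute("UPDATE dataset SET is_public = 0 WHERE is_public IS NULL")

# Step 2: publish every private dataset that belongs to a domain layout.
db.execute("""
    UPDATE dataset SET is_public = 1 WHERE id IN (
        SELECT did FROM (
            SELECT d.id AS did
              FROM dataset d
                   JOIN layout_members lm ON lm.dataset_id = d.id
                   JOIN layout l ON lm.layout_id = l.id
             WHERE l.is_domain = 1
               AND d.is_public = 0
        ) AS dats
    )
""")

flags = dict(db.execute("SELECT id, is_public FROM dataset ORDER BY id").fetchall())
print(flags)
```

In this toy run, the datasets in the domain layout end up public, while the private dataset that belongs only to a non-domain layout is left untouched.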