[genomic support] Use column of dataset A as filter in dataset B

ada-discovery / ada-issues

[genomic support] Use column of dataset A as filter in dataset B #194

Open sherzinger opened 3 years ago

sherzinger commented 3 years ago

There will be other feature requests for genomic support, but I believe this is a fundamental first step. Please feel free to give your own input or suggest something entirely different!

Scenario: We have two datasets:

clinicalData:

| ID | age | gender | ... |
| --- | --- | --- | --- |
| s123 | 40 | f | ... |
| s124 | 50 | m | ... |
| ... | | | |

genomicalData (one of several ways to structure it):

| sample_id | gene1 | gene2 | ... |
| --- | --- | --- | --- |
| s123 | 0.1 | 0.2 | ... |
| s124 | 0.3 | 0.4 | ... |

Use case description: I want to filter the clinicalData dataset by age < 45 && gender == f, and then view the genomicalData dataset restricted to the filtered samples. This could be achieved by interacting with the ID column in clinicalData, which opens a dialog with two parameters: the target dataset and the target dataset column. In the scenario above these would take the values genomicalData and sample_id. After confirming the dialog, I am redirected to genomicalData with the following filter already applied: sample_id == [s123, ...].
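
To make the intended behaviour concrete, here is a minimal Python sketch of the use case above. It is only an illustration, not a proposal for how Ada should implement it: the datasets are plain in-memory lists holding the example rows from this issue, and the helper names `values_of` and `filter_by_values` are hypothetical.

```python
# In-memory stand-ins for the two datasets, using only the example rows above.
clinical_data = [
    {"ID": "s123", "age": 40, "gender": "f"},
    {"ID": "s124", "age": 50, "gender": "m"},
]
genomical_data = [
    {"sample_id": "s123", "gene1": 0.1, "gene2": 0.2},
    {"sample_id": "s124", "gene1": 0.3, "gene2": 0.4},
]


def values_of(rows, column):
    """Collect the values of one column (hypothetical helper)."""
    return [row[column] for row in rows]


def filter_by_values(rows, column, allowed):
    """Keep rows whose column value is among the allowed values (hypothetical helper)."""
    allowed_set = set(allowed)
    return [row for row in rows if row[column] in allowed_set]


# Step 1: filter clinicalData by age < 45 && gender == f.
filtered_clinical = [
    row for row in clinical_data if row["age"] < 45 and row["gender"] == "f"
]

# Step 2: take the ID column of the filtered rows (the "source column").
selected_ids = values_of(filtered_clinical, "ID")

# Step 3: apply it as a filter on the target dataset and target column.
filtered_genomic = filter_by_values(genomical_data, "sample_id", selected_ids)

print(selected_ids)
print(filtered_genomic)
```

The dialog's two parameters map onto the last call: the target dataset is the first argument and the target dataset column is the second.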

LBolzani commented 2 years ago

Hello @sherzinger, I would like to introduce the following in the omics view:

~~Checkbox buttons on the left side of the table to select the records to match with other omics datasets~~;
A flag on the dataset settings page to mark the dataset as omics type;
A button at the top right of the view to trigger the matching with other omics datasets.

Luca

sherzinger commented 2 years ago

Sounds good to me. The omics flag would allow us to make and enforce some assumptions about the data type for the later steps of this feature.

LBolzani commented 2 years ago

Hello @sherzinger, I made an update on https://10.240.16.149 regarding the omics integration. Let's talk when we're both back from the Christmas holidays.

Luca

LBolzani commented 2 years ago

Hello @soumyabrataghosh, I made an update on https://10.240.16.149. Let us know what you think.

gh-osh commented 2 years ago

Yes, I checked that. One thing came to mind: can we join more than two tables? For example, join Table T1 with Table T2 on person_id, then join the result joined(T1,T2) with Table T3 on visit_id. I also have another comment about the array feature, or more generally about how to handle time-series data (either visit_id becomes a secondary key, or person_id-visit_id becomes the primary key).

But otherwise this feature is a giant leap, no doubt about it.

LBolzani commented 2 years ago

@soumyabrataghosh the first part of your comment, regarding joins across datasets, is already covered by the current implementation. It is possible to join tables T1 and T2 and then use the result in a join with another table T3.
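
For readers following the thread, the chained join discussed above can be sketched in Python as follows. This is an illustration only and says nothing about how the current implementation works internally: the `inner_join` helper, the table contents, and the assumption that T2 carries both person_id and visit_id are all made up for the example.

```python
def inner_join(left, right, key):
    """Inner-join two lists of row dicts on a shared key column (hypothetical helper)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Merge every left row with each matching right row.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Made-up example tables; T2 is assumed to link persons to visits.
t1 = [
    {"person_id": "p1", "age": 40},
    {"person_id": "p2", "age": 50},
]
t2 = [
    {"person_id": "p1", "visit_id": "v1"},
    {"person_id": "p1", "visit_id": "v2"},
    {"person_id": "p2", "visit_id": "v3"},
]
t3 = [
    {"visit_id": "v1", "gene1": 0.1},
    {"visit_id": "v3", "gene1": 0.3},
]

# First join T1 with T2 on person_id ...
t12 = inner_join(t1, t2, "person_id")
# ... then join the result with T3 on visit_id.
t123 = inner_join(t12, t3, "visit_id")

for row in t123:
    print(row)
```

In this sketch each row of the intermediate result is identified by the pair (person_id, visit_id), which corresponds to the composite-key option raised for time-series data.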