mcveanlab / treeseq-inference

Work for the tree sequence inference paper.
Apache License 2.0
23 stars 9 forks

Availability of UKB tree sequences and GNN #71

Closed. Chris1221 closed this issue 1 year ago

Chris1221 commented 1 year ago

Hi guys,

I was reading the notebook for analyzing the UKB data here. I was wondering if the output data at various points of the notebook was deposited anywhere? Things like gnn_nuts2.csv, ukb_reverse_geocoded_nuts2.csv, the tree sequence referred to in ts_path, etc. More generally, is the pre-computed ts for the UKB available, either by request or publicly?

Thank you

hyanwong commented 1 year ago

We aren't allowed to release any intermediate data products publicly (or indeed privately!) that could be seen as releasing any of the dataset.

We could, I suppose, release some of the summary GNN values (e.g. the values actually plotted in some of the released plots). As far as I know, we haven't done so yet, however. I guess we might want to check with UKBB that this would be OK.

hyanwong commented 1 year ago

Some of the more detailed GNN analyses run rather close to the line of being able to deanonymise genomes in UKBB, so we need to be super careful, even with the data summaries.

Chris1221 commented 1 year ago

That's fair, thanks Yan. If you do decide to release any summary-level GNNs, I think it would certainly be useful to the community, but of course I appreciate the importance of not being able to deanonymise individuals.