Fedja closed this 1 year ago
Alright, let's think about this.
The phenotype info is straightforward as soon as we have the single file where these things are available.
The previous release data is also quite straightforward: it is basically the same as the other annotations, only the file is different. It should not be too hard to implement; most likely the previous release files just need to be added to the WDL input array. It probably makes sense to split the WDL work and the script changes into their own PRs.
The UKBB replication could be done either as a separate comparison datasource or as an annotation. The main challenge here is probably how we couple the UKBB data to the script: do we have some sort of API, or some other file, from which we can pull these results?
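A rough sketch of what "similar to the other annotations" could look like on the script side, assuming the previous release results can be keyed by variant. The column names (`chrom`, `pos`, `ref`, `alt`, `pval`, `beta`), the `_previous_release` suffix, and both function names are hypothetical and only there for illustration; the toy rows are made up.

```python
def variant_key(row):
    # Hypothetical key columns; the real files may name these differently.
    return (row["chrom"], row["pos"], row["ref"], row["alt"])


def add_previous_release(current_rows, previous_rows, columns=("pval", "beta")):
    """Attach previous-release values to current rows, 'NA' where no match."""
    previous = {variant_key(r): r for r in previous_rows}
    annotated_rows = []
    for row in current_rows:
        match = previous.get(variant_key(row))
        annotated = dict(row)  # copy so the input rows are left untouched
        for column in columns:
            annotated[column + "_previous_release"] = match[column] if match else "NA"
        annotated_rows.append(annotated)
    return annotated_rows


# Toy data, not real results.
current = [
    {"chrom": "1", "pos": "1000", "ref": "A", "alt": "G", "pval": "1e-9", "beta": "0.2"},
    {"chrom": "2", "pos": "2000", "ref": "C", "alt": "T", "pval": "5e-8", "beta": "-0.1"},
]
previous = [
    {"chrom": "1", "pos": "1000", "ref": "A", "alt": "G", "pval": "1e-7", "beta": "0.18"},
]
with_previous = add_previous_release(current, previous)
```

Variants missing from the previous release get `NA` rather than being dropped, so the report keeps the same rows either way.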
The longname, category, and n cases/controls are found in a file named finngen_R5_pheno_n.tsv in the green library. I'll add that as an annotation file and use it as the phenotype info file.
Genentech have been scraping and joining additional information to the top report and variant reports, which has been complicated by the changing file structure on our side. We should deliver these automatically. Attached are example files from Sarah Pendergrass.
These should be in the R6 reports!
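Something like the following could do the phenotype info join. This is only a sketch: I have not checked the header of finngen_R5_pheno_n.tsv, so the column names (`phenocode`, `name`, `category`, `num_cases`, `num_controls`) and the function names are assumptions, and the example rows are made up.

```python
import csv
import io


def load_pheno_info(handle, key="phenocode"):
    """Read a tab-separated phenotype info file into a dict keyed by phenotype."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key]: row for row in reader}


def annotate_with_pheno_info(
    report_rows,
    pheno_info,
    fields=("name", "category", "num_cases", "num_controls"),
):
    """Add the phenotype info fields to each report row, 'NA' when unknown."""
    annotated_rows = []
    for row in report_rows:
        info = pheno_info.get(row["phenocode"], {})
        annotated = dict(row)  # copy so the input rows are left untouched
        for field in fields:
            annotated[field] = info.get(field, "NA")
        annotated_rows.append(annotated)
    return annotated_rows


# Toy stand-in for the real file; the column names are assumed.
example_tsv = (
    "phenocode\tname\tcategory\tnum_cases\tnum_controls\n"
    "PHENO_A\tExample phenotype A\tExample category\t100\t900\n"
)
pheno_info = load_pheno_info(io.StringIO(example_tsv))
report = [
    {"phenocode": "PHENO_A", "locus_id": "locus1"},
    {"phenocode": "PHENO_B", "locus_id": "locus2"},
]
annotated = annotate_with_pheno_info(report, pheno_info)
```

In the actual script the handle would come from opening the file passed in as the phenotype info input, and phenotypes missing from the file end up with `NA` in the added columns.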
Main things:
EnrichedVariants_Example.txt GroupReports_Example.txt