katewarner closed 1 day ago
Please check accuracy of the created datasets:
$ ls -ltr unreviewed/*allian*
-rw-r--r--. 1 rykahsay glygen 2656964 Jun 26 15:00 unreviewed/mouse_protein_disease_alliance_genome.csv
-rw-r--r--. 1 rykahsay glygen 2811974 Jun 26 15:00 unreviewed/rat_protein_disease_alliance_genome.csv
-rw-r--r--. 1 rykahsay glygen 1753123 Jun 26 15:00 unreviewed/fruitfly_protein_disease_alliance_genome.csv
-rw-r--r--. 1 rykahsay glygen 646313 Jun 26 15:01 unreviewed/yeast_protein_disease_alliance_genome.csv
-rw-r--r--. 1 rykahsay glygen 2918094 Jun 26 15:01 unreviewed/human_protein_disease_alliance_genome.csv
I checked *_protein_disease_alliance_genome.csv datasets and they all look accurate. There was an increase across the datasets, especially in the rat dataset, but Karina thinks it's due to the proteome update.
... and what do I need to do now? If nothing, you need to close the ticket
We have a new input for all *_protein_disease_alliance_genome.csv: downloads/alliance_genome/current/DISEASE-ALLIANCE_COMBINED.tsv
This means that instead of individual per-species files there is now a single file containing all the species (DISEASE-ALLIANCE_COMBINED.tsv). Use the Taxon column to determine the species. Please update your script and reprocess the datasets: *_protein_disease_alliance_genome.csv
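In case it helps with the script update, here is a rough sketch of splitting the combined file by the Taxon column. It is not the actual pipeline code. Assumptions on my side: the Taxon values are NCBI taxonomy IDs (either bare or prefixed like "NCBITaxon:9606"), any lines starting with "#" come before the header row, and the species names match the prefixes of the existing output files. The function name split_by_taxon and the gene_id column in the sample are made up for illustration.

```python
import csv
from collections import defaultdict

# Assumed mapping from NCBI taxonomy IDs to the species prefixes used in the
# dataset file names; verify against the real values in the Taxon column.
TAXON_TO_SPECIES = {
    "9606": "human",
    "10090": "mouse",
    "10116": "rat",
    "7227": "fruitfly",
    "559292": "yeast",
}


def split_by_taxon(lines, taxon_column="Taxon"):
    """Group rows of the combined TSV by species, keyed on the Taxon column."""
    # Skip blank lines and '#' comment lines (assumed to precede the header).
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    rows_by_species = defaultdict(list)
    for row in reader:
        # Accept either "NCBITaxon:9606" or a bare "9606".
        taxon_id = (row.get(taxon_column) or "").split(":")[-1].strip()
        species = TAXON_TO_SPECIES.get(taxon_id)
        if species is not None:  # rows for other taxa are ignored
            rows_by_species[species].append(row)
    return dict(rows_by_species)


# Tiny in-memory sample; "gene_id" is a placeholder column, not a real one.
sample = [
    "# hypothetical comment line\n",
    "Taxon\tgene_id\n",
    "NCBITaxon:9606\tG1\n",
    "NCBITaxon:10116\tG2\n",
    "NCBITaxon:9606\tG3\n",
]
groups = split_by_taxon(sample)
print({species: len(rows) for species, rows in groups.items()})
```

For the real input, the open file handle for downloads/alliance_genome/current/DISEASE-ALLIANCE_COMBINED.tsv can be passed in place of the sample list, and each group then feeds the existing per-species processing that writes *_protein_disease_alliance_genome.csv.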