nlwashington closed this 8 years ago
also, wbpaper xrefs: http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=WpaXref
note that previously (in disco) we got additional phenotype annotation files from wormmine, that were only partially overlapping with those in the static files above:
ANNOTATION_FILE="http://www.wormbase.org/tools/wormmine/service/query/results?format=tab&start=0&query=%3Cquery+model%3D%22genomic%22+view%3D%22Allele.primaryIdentifier+Allele.symbol+Allele.naturalVariant+Allele.method+Allele.gene.primaryIdentifier+Allele.gene.secondaryIdentifier+Allele.gene.symbol+Allele.gene.chromosome.primaryIdentifier%22+sortOrder%3D%22Allele.primaryIdentifier+ASC%22+%3E%3Cjoin+path%3D%22Allele.gene%22+style%3D%22OUTER%22%2F%3E%3Cjoin+path%3D%22Allele.gene.chromosome%22+style%3D%22OUTER%22%2F%3E%3C%2Fquery%3E"
wget -nv -O wb_allele.txt "$ANNOTATION_FILE"
check_errs $? "wget error"

log "Downloading annotation file"
ANNOTATION_FILE="http://www.wormbase.org/tools/wormmine/service/query/results?format=tab&start=0&query=%3Cquery+model%3D%22genomic%22+view%3D%22BioEntity.primaryIdentifier+BioEntity.symbol+BioEntity.phenotypesObserved.identifier+BioEntity.phenotypesObserved.name%22+sortOrder%3D%22BioEntity.primaryIdentifier+ASC%22+%3E%3C%2Fquery%3E"
wget -nv -O wb_extra_variant_phenotypes.txt "$ANNOTATION_FILE"
check_errs $? "wget error"

log "Downloading annotation file"
ANNOTATION_FILE="http://www.wormbase.org/tools/wormmine/service/query/results?format=tab&start=0&query=%3Cquery+model%3D%22genomic%22+view%3D%22BioEntity.primaryIdentifier+BioEntity.symbol+BioEntity.phenotypesNotObserved.identifier+BioEntity.phenotypesNotObserved.name%22+sortOrder%3D%22BioEntity.primaryIdentifier+ASC%22+%3E%3C%2Fquery%3E"
wget -nv -O wb_phenotypes_not_observed.txt "$ANNOTATION_FILE"
check_errs $? "wget error"
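the snippet above calls `check_errs` and `log`, which are not defined in what was pasted here. a minimal sketch of what they might look like follows, so the snippet can be run on its own; both bodies are assumptions, not the definitions used in disco.

```shell
# Hypothetical stand-ins for the helpers the script above relies on.
# The real definitions are not shown in this thread.

log() {
    # Timestamped message on stderr, so it never mixes with downloaded data.
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >&2
}

check_errs() {
    # $1: exit status of the previous command
    # $2: message to print if that status is non-zero
    if [ "$1" -ne 0 ]; then
        log "ERROR ($1): $2"
        exit "$1"
    fi
}
```

with these in place, `check_errs $? "wget error"` stops the script with wget's own exit status as soon as a download fails.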
note that there is also a wormmine python api
also, the KO alleles: ftp://ftp.wormbase.org/pub/wormbase/releases/current-production-release/species/c_elegans/$BIOPROJECT/annotation/c_elegans*.knockout_consortium_alleles.xml.gz
FWIW, we also scrubbed the following from gff3:
for col2 in Allele Mos_insertion_allele ; do
  grep -P "\t$col2\t" c_elegans.annotations.gff3
done > allele_dump.gff3

for col2 in Coding_transcript Genomic_canonical Non_coding_transcript Orfeome Promoterome Pseudogene RNAi_primary RNAi_secondary Reference Transposon Transposon_CDS cDNA_for_RNAi miRanda ncRNA operon polyA_signal_sequence polyA_site snlRNA ; do
  grep -P "\t$col2\t" c_elegans.annotations.gff3
done > genomic_feat_dump.gff3
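the loops above re-read the whole gff3 once per name, and the grep pattern matches the name in any tab-delimited column. a possible single-pass alternative with awk is sketched below; it tests column 2 only. the function name `filter_gff3_by_source` is made up for illustration. note that rows come out in file order rather than grouped by name, which differs from the loop output.

```shell
# Hypothetical helper: print GFF3 rows whose column 2 is one of the given names.
# Usage: filter_gff3_by_source FILE NAME [NAME ...]
filter_gff3_by_source() {
    gff="$1"
    shift
    # "$*" joins the remaining arguments with spaces; awk splits them again.
    awk -F'\t' -v list="$*" '
        BEGIN {
            n = split(list, names, " ")
            for (i = 1; i <= n; i++) keep[names[i]] = 1
        }
        # Skip comment/pragma lines, keep rows whose 2nd column is in the list.
        !/^#/ && ($2 in keep)
    ' "$gff"
}

# Example (not run here):
# filter_gff3_by_source c_elegans.annotations.gff3 Allele Mos_insertion_allele > allele_dump.gff3
```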
first pass on this is done using just the data from ftp (not from wormmine), but not including the orthology, interaction, or expression data.
the data in wormmine appears to be 2+ years old; i am hesitant to add the data from that page.
the task of adding interaction data is moved to ticket #214.
the bulk of wormbase is finished, and includes:
what is to be done in the future (when we can deal with it) is: genes in development: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/development_association.WS249.wb or in anatomical location: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/anatomy_association.WS249.wb
Wooooot!
congrats!
we need to pull in the wormbase geno-pheno data:
ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/
rnai phenotypes: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/rnai_phenotypes.WS249.wb ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/rnai_phenotypes_quick.WS249.wb
other phenotypes: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/phenotype_association.WS249.wb
genes in development: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/development_association.WS249.wb or in anatomical location: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/ONTOLOGY/anatomy_association.WS249.wb
feature locations: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS249.annotations.gff3.gz
xrefs: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS249.xrefs.txt.gz
papers we had to get elsewhere.
gene ids: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.WS249.geneIDs.txt.gz
really nice (prose) descriptions of genes: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.WS249.functional_descriptions.txt.gz
orthologs: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.WS249.orthologs.txt.gz
gene interactions: ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.WS249.gene_interactions.txt.gz
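the urls above all embed the release (WS249) and the bioproject (PRJNA13758), so they will go stale when current-development-release moves on. a rough sketch of building them from two variables is below. the function names and the `RELEASE`/`BIOPROJECT` defaults are assumptions taken from the urls listed here, and it only covers the ONTOLOGY and annotation files, not the gff3 or xrefs one directory up.

```shell
# Assumed defaults, copied from the URLs listed in this ticket.
RELEASE="${RELEASE:-WS249}"
BIOPROJECT="${BIOPROJECT:-PRJNA13758}"
BASE="ftp://ftp.wormbase.org/pub/wormbase/releases/current-development-release"

# Hypothetical helper: URL of a per-species annotation file.
# $1: file stem, e.g. geneIDs, orthologs, gene_interactions
annotation_url() {
    printf '%s/species/c_elegans/%s/annotation/c_elegans.%s.%s.%s.txt.gz\n' \
        "$BASE" "$BIOPROJECT" "$BIOPROJECT" "$RELEASE" "$1"
}

# Hypothetical helper: URL of an ontology association file.
# $1: file stem, e.g. phenotype_association, rnai_phenotypes
ontology_url() {
    printf '%s/ONTOLOGY/%s.%s.wb\n' "$BASE" "$1" "$RELEASE"
}

# Print every URL, one per line. Nothing is downloaded here.
list_urls() {
    for stem in rnai_phenotypes rnai_phenotypes_quick phenotype_association \
                development_association anatomy_association; do
        ontology_url "$stem"
    done
    for stem in geneIDs functional_descriptions orthologs gene_interactions; do
        annotation_url "$stem"
    done
}

# Example (not run here): fetch everything in one go.
# list_urls | wget -nv -i -
```

setting `RELEASE` before sourcing this would point it at a different release, assuming the file naming stays the same.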