pombase / pombase-chado

PomBase code for accessing Chado

query for jb/so #243

Closed pombase-admin closed 9 years ago

pombase-admin commented 10 years ago

All genes which have "conserved in vertebrate" (species distribution) AND "conserved in cerevisiae" (species distribution) AND "predominantly single copy" (species distribution) AND conserved_unknown (annotation status).

(This list should currently have 104 entries)

Then export a table with

  1. Pombe systematic ID
  2. Pombe gene name
  3. Pombe product
  4. SGD systematic ID
  5. Human HGNC name
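A minimal sketch of the five-column export, assuming the selected genes are already available as Python dictionaries. The column keys, the placeholder rows and the `export_table` helper are all hypothetical; real values would come from Chado.

```python
import csv
import io

# Column order follows the export list above; the key names are assumptions.
COLUMNS = [
    "pombe_systematic_id",
    "pombe_gene_name",
    "pombe_product",
    "sgd_systematic_id",
    "human_hgnc_name",
]

# Placeholder rows only; these are not real identifiers.
rows = [
    {"pombe_systematic_id": "ID_1", "pombe_gene_name": "name1",
     "pombe_product": "product 1", "sgd_systematic_id": "SGD_1",
     "human_hgnc_name": "HGNC_1"},
    # A gene without a name keeps an empty cell in that column.
    {"pombe_systematic_id": "ID_2", "pombe_gene_name": "",
     "pombe_product": "product 2", "sgd_systematic_id": "SGD_2",
     "human_hgnc_name": "HGNC_2"},
]


def export_table(rows):
    """Return the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


table_text = export_table(rows)
print(table_text)
```

Tab-separated output is used here so that product descriptions containing commas need no quoting.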

Also the same query/output with further filters: i) those with GO mitochondria, ii) those with any "disease association", iii) those with GO nucleus.

Original comment by: ValWood

pombase-admin commented 10 years ago

I hadn't checked the unknowns for a while so I just removed a number of gene products from the unknowns list as there was info from SGD to make ISO annotations

Committed revision 1329, so the numbers will be slightly different.

Also ISO's every subunit of i) Cdc48p-Npl4p-Vms1p AAA ATPase complex ii) Seh1-associated complex

Original comment by: ValWood

pombase-admin commented 10 years ago

I'm a bit stuck on this one. "conserved in cerevisiae" isn't in Chado and I can't see it in the contig files. Can you give me a hint?

Original comment by: kimrutherford

pombase-admin commented 10 years ago

Sorry, "conserved in S. cerevisiae": it's a 'species distribution'.

Original comment by: ValWood

pombase-admin commented 10 years ago

Sorry, "conserved in S. cerevisiae": it's a 'species distribution'.

Sorry, I can't find "conserved in S. cerevisiae" in Chado or in the contig files.

Here are the counts of the number of features that have each term from the species_dist cv:

                  name                  | count
----------------------------------------+-------
 conserved in archaea                   |   237
 conserved in bacteria                  |  1001
 conserved in eukaryotes                |  4485
 conserved in eukaryotes only           |  2486
 conserved in fungi                     |  4567
 conserved in fungi only                |   604
 conserved in metazoa                   |  3418
 conserved in Schizosaccharomyces       |     1
 conserved in Schizosaccharomyces only  |     2
 conserved in vertebrates               |  3395
 faster evolving duplicate              |    23
 identified in S. pombe only            |   369
 no apparant orthologs                  |     1
 no apparent orthologs                  |   356
 no apparent S. cerevisiae ortholog     |   579
 orthologs cannot be distinguished      |   136
 predominantly single copy (one to one) |  3085
 sequence orphan                        |   359
 sequence orphan, characterised         |    60
 sequence orphan, uncharacterised       |   299
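For illustration, a per-term count like the one above could be reproduced in Python from (feature, term) pairs. This is only a sketch: the `feature_terms` list is invented toy data, and the assumption is that each feature should be counted once per term even if the pair appears more than once.

```python
from collections import Counter

# Invented (feature, term) pairs; a real list would be read from Chado.
feature_terms = [
    ("feature_1", "conserved in fungi"),
    ("feature_1", "conserved in vertebrates"),
    ("feature_2", "conserved in fungi"),
    ("feature_2", "conserved in fungi"),  # duplicate pair, counted once
    ("feature_3", "sequence orphan"),
]

# De-duplicate the pairs first, then count features per term name.
counts = Counter(term for _feature, term in set(feature_terms))

for name, count in sorted(counts.items()):
    print(f"{name:<40}| {count:>5}")
```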

Original comment by: kimrutherford

pombase-admin commented 10 years ago

Sorry, I wasn't thinking. You need to do the query vs human (vertebrate) as described and then remove any which have "no apparent S. cerevisiae ortholog".
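The revised query can be sketched as a set filter in Python. This is an illustration only: the `annotations` mapping, the gene IDs and the `select_genes` helper are hypothetical, and it assumes the species distribution and annotation status terms for each gene have already been extracted into a dictionary.

```python
# Hypothetical data: gene ID -> set of term names attached to that gene.
# The gene IDs are placeholders, not real PomBase identifiers.
annotations = {
    "gene_A": {"conserved in vertebrates",
               "predominantly single copy (one to one)",
               "conserved_unknown"},
    "gene_B": {"conserved in vertebrates",
               "predominantly single copy (one to one)",
               "conserved_unknown",
               "no apparent S. cerevisiae ortholog"},
    "gene_C": {"conserved in vertebrates",
               "conserved_unknown"},
}

# Terms a gene must have, and terms that exclude it.
REQUIRED = {"conserved in vertebrates",
            "predominantly single copy (one to one)",
            "conserved_unknown"}
EXCLUDED = {"no apparent S. cerevisiae ortholog"}


def select_genes(annotations):
    """Keep genes that have every required term and no excluded term."""
    return sorted(
        gene for gene, terms in annotations.items()
        if REQUIRED <= terms and not (EXCLUDED & terms)
    )


selected = select_genes(annotations)
print(selected)
```

Here `gene_B` is dropped because it has no apparent S. cerevisiae ortholog, and `gene_C` is dropped because it is not single copy.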

Original comment by: ValWood

pombase-admin commented 10 years ago

OK, thanks.

I get 103 genes. Here's the list: https://www.dropbox.com/s/ya1ygf012lg7ow2/conserved_unknown.txt

those with GO mitochondria

There isn't a "mitochondria" term in GO. Which term did you mean?

Also the same query/output with further filters

I'm having some trouble with this. The transitive closure of GO stored in Chado doesn't seem to be right. I'll work on it.

Also ISO's every subunit of i) Cdc48p-Npl4p-Vms1p AAA ATPase complex ii) Seh1-associated complex

What does this mean?

Original comment by: kimrutherford

pombase-admin commented 10 years ago

There isn't a "mitochondria" term in GO. Which term did you mean?

Sorry, I looked into Chado, but I didn't check the synonyms. It's "mitochondrion" in GO.

(It might not help though as the transitive closure stored in Chado appears borked - still working on that.)
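As a rough sketch of what the closure is needed for: a gene counts as "mitochondrion" if it is annotated to that term or to any term below it. The Python below computes the descendants on the fly instead of relying on a stored closure. The `PARENTS` edges are a simplified toy graph, not the real GO relations, and the gene IDs and helper names are hypothetical.

```python
from collections import defaultdict

# Toy child -> parents edges; simplified, not the real GO structure.
PARENTS = {
    "mitochondrial inner membrane": ["mitochondrial membrane"],
    "mitochondrial membrane": ["mitochondrion"],
    "mitochondrial matrix": ["mitochondrion"],
    "nucleolus": ["nucleus"],
}

# Hypothetical direct annotations: gene ID -> set of term names.
gene_terms = {
    "gene_A": {"mitochondrial inner membrane"},
    "gene_B": {"nucleolus"},
    "gene_C": {"mitochondrion"},
}


def descendants(root, parents):
    """Return root plus every term that reaches root via parent edges."""
    children = defaultdict(set)
    for child, parent_list in parents.items():
        for parent in parent_list:
            children[parent].add(child)
    seen = {root}
    stack = [root]
    while stack:
        term = stack.pop()
        for child in children[term]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def genes_under(root, gene_terms, parents):
    """Genes annotated to root or to any of its descendants."""
    closure = descendants(root, parents)
    return sorted(g for g, terms in gene_terms.items() if terms & closure)


mito_genes = genes_under("mitochondrion", gene_terms, PARENTS)
print(mito_genes)
```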

Original comment by: kimrutherford

pombase-admin commented 10 years ago

re Also ISO's every subunit of i) Cdc48p-Npl4p-Vms1p AAA ATPase complex ii) Seh1-associated complex

What does this mean?

Ignore this bit. I was only recording that I had done these curation updates, but it was confusing to put it in this ticket.

Also don't worry too much yet about the GO part of the query (although it would be useful to know how to do it).

I'll start with this list and then see if they want any further specific partitioning. Also we can do this in the query tool using the list upload facility with combined queries.

Original comment by: ValWood
