Closed pgaudet closed 3 years ago
I can add a list of all of the transport terms that PomBase blocks for direct annotation. That might be a useful start.
That would be helpful.
Thanks, Pascale
All sounds good. One note of caution though: May be worth looking into longer range transport processes that are relevant in metazoans but not in unicellular organisms. Are there cases where we might want to use a general class to group long-range transport in a metazoan organism e.g. - transcytosis across gut epithelium or extracellular factors that affect long range transport/diffusion of a ligand?
"extracellular factors that affect long range transport/diffusion of a ligand"
Is that really transport (i.e., directed movement)? Diffusion isn't transport. It should be possible to pin down the parts of these processes that really are transport (directed movement from A to B); otherwise they should be annotated to "localization to x".
Sorry forgot to actually do it.
Here is our list of transport terms where it should always be possible to annotate more specifically. At least it has been for us so far. transport_terms_not_specific_enough.txt
GO:0015755 fructose transport - OK renamed TM
GO:0015757 galactose transmembrane transport OK RENAMED TM
GO:0008645 hexose transport Merged with TM transport
GO:0015755 fructose transport OK RENAMED TM
GO:0015758 glucose transport Merged with TM transport GO:1904659
GO:0015756 fucose transport OK RENAMED TM
GO:0015754 allose transport OK RENAMED TM
GO:0015761 mannose transport OK RENAMED TM
GO:0015762 rhamnose transport OK RENAMED TM
GO:0015735 uronic acid transport OK RENAMED TM
GO:0015736 hexuronate transport OK RENAMED TM
GO:0015737 galacturonate transport OK RENAMED TM
GO:0015738 glucuronate transport OK RENAMED TM
GO:0042874 D-glucuronate transport OK RENAMED TM
GO:0015750 pentose transport OK RENAMED TM
GO:0015751 arabinose transport OK RENAMED TM
GO:0015752 D-ribose transport OK RENAMED TM
GO:0015753 D-xylose transport OK RENAMED TM
GO:0042899 arabinan transport OK RENAMED TM
GO:0042882 L-arabinose transport OK RENAMED TM
GO:0046411 2-keto-3-deoxygluconate transport OK RENAMED TM
GO:0015749 monosaccharide transport OK merged into TM GO:1905950
GO:0015792 arabinitol transport OK RENAMED TM
GO:0042869 aldarate transport OK RENAMED TM
GO:0042870 D-glucarate transport OK RENAMED TM
GO:1902300 galactarate transport OK RENAMED TM
GO:0015745 tartrate transport OK RENAMED TM
GO:0015725 gluconate transport Merge into TM GO:0035429
GO:0042875 D-galactonate transport OK RENAMED TM
GO:0015726 L-idonate transport OK RENAMED TM
GO:0015713 phosphoglycerate transport OK RENAMED TM
GO:0042873 aldonate transport OK RENAMED TM
GO:0015733 shikimate transport OK RENAMED TM
GO:0006848 pyruvate transport merge into GO:1901475 pyruvate transmembrane transport
GO:0008643 carbohydrate transport
GO:0000101 sulfur amino acid transport
GO:0000296 spermine transport
GO:0006811 ion transport
GO:0006812 cation transport
GO:0006813 potassium ion transport
GO:0006814 sodium ion transport
GO:0006816 calcium ion transport
GO:0006820 anion transport
GO:0006826 iron ion transport
GO:0006829 zinc ion transport
GO:0006830 high-affinity zinc ion transport
GO:0006831 low-affinity zinc ion transport
GO:0006862 nucleotide transport
GO:0006865 amino acid transport
GO:0006867 asparagine transport
GO:0006868 glutamine transport
GO:0015672 monovalent inorganic cation transport
GO:0015695 organic cation transport
GO:0015696 ammonium transport
GO:0015697 quaternary ammonium group transport
GO:0015698 inorganic anion transport
GO:0015711 organic anion transport (transmembrane?)
GO:0015718 monocarboxylic acid transport
GO:0015748 organophosphate ester transport
GO:0015780 nucleotide-sugar transport (transmembrane?)
GO:0015781 pyrimidine nucleotide-sugar transport
GO:0015788 UDP-N-acetylglucosamine transport
GO:0015801 aromatic amino acid transport
GO:0015802 basic amino acid transport
GO:0015803 branched-chain amino acid transport
GO:0015804 neutral amino acid transport
GO:0015807 L-amino acid transport
GO:0015809 arginine transport
GO:0015816 glycine transport
GO:0015817 histidine transport
GO:0015818 isoleucine transport
GO:0015819 lysine transport
GO:0015820 leucine transport
GO:0015821 methionine transport
GO:0015822 ornithine transport
GO:0015823 phenylalanine transport
GO:0015824 proline transport
GO:0015826 threonine transport
GO:0015827 tryptophan transport
GO:0015828 tyrosine transport
GO:0015829 valine transport
GO:0015838 amino-acid betaine transport
GO:0015844 monoamine transport
GO:0015846 polyamine transport
GO:0015848 spermidine transport
GO:0015849 organic acid transport
GO:0015850 organic hydroxy compound transport
GO:0015857 uracil transport
GO:0015879 carnitine transport
GO:0015904 tetracycline transport
GO:0015931 nucleobase-containing compound transport
GO:0015986 ATP synthesis coupled proton transport
GO:0030001 metal ion transport
GO:0031460 glycine betaine transport
GO:0031919 vitamin B6 transport
GO:0015882 L-ascorbic acid transport OK RENAMED TM
GO:0031920 pyridoxal transport
GO:0031922 pyridoxamine transport
GO:0031923 pyridoxine transport
GO:0032328 alanine transport
GO:0032329 serine transport
GO:0042883 cysteine transport
GO:0042886 amide transport
GO:0045117 azole transport
GO:0046942 carboxylic acid transport
GO:0051608 histamine transport
GO:0051937 catecholamine transport
GO:0060402 calcium ion transport into cytosol
GO:0071582 negative regulation of zinc ion transport
GO:0071705 nitrogen compound transport
GO:0071804 cellular potassium ion transport
GO:0072337 modified amino acid transport
GO:0072348 sulfur compound transport
GO:0072531 pyrimidine-containing compound transmembrane transport
GO:0090481 pyrimidine nucleotide-sugar transmembrane transport
GO:1900751 4-(trimethylammonio)butanoate transport
GO:1901374 acetate ester transport
GO:1901679 nucleotide transmembrane transport
GO:1902023 L-arginine transport
GO:1902024 L-histidine transport
GO:1901264 carbohydrate derivative transport
GO:0071702 organic substance transport
???
GO:0046907 intracellular transport
GO:1902495 transmembrane transporter complex
GO:1902582 single-organism intracellular transport
GO:0044765 single-organism transport
GO:0005215 transporter activity
GO:0000041 transition metal ion transport
GO:0051028 mRNA transport
GO:0016482 cytoplasmic transport
GO:0060401 cytosolic calcium ion transport
GO:0070838 divalent metal ion transport
GO:0072511 divalent inorganic cation transport
GO:0072512 trivalent inorganic cation transport
GO:0051180 vitamin transport
Iron
GO:0015684 ferrous iron transport
Possible neurotransmitters
GO:0015870 acetylcholine transport
GO:0015871 choline transport
GO:0015872 dopamine transport
Regulation
GO:1903651 positive regulation of cytoplasmic transport
GO:1903789 regulation of amino acid transmembrane transport
GO:0032386 regulation of intracellular transport
GO:0032388 positive regulation of intracellular transport
GO:0034762 regulation of transmembrane transport
GO:0071579 regulation of zinc ion transport
GO:0043269 regulation of ion transport
GO:0043270 positive regulation of ion transport
GO:0051924 regulation of calcium ion transport
GO:0051928 positive regulation of calcium ion transport
GO:1903649 regulation of cytoplasmic transport
GO:0051049 regulation of transport
GO:0051050 positive regulation of transport
GO:0051051 negative regulation of transport
GO:0010522 regulation of calcium ion transport into cytosol
GO:0010959 regulation of metal ion transport
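For anyone who wants to track progress against the list above, here is a minimal Python sketch that splits each line into a GO ID, a label, and a status. The status names ("renamed", "merged", "pending"), the `TermEntry` type and the `parse_entry` helper are assumptions made for illustration; they are not part of any GO tooling. It also assumes one term per line and that the status notes are spelled as they are in this comment.

```python
import re
from typing import NamedTuple, Optional


class TermEntry(NamedTuple):
    go_id: str
    label: str
    status: str  # "renamed", "merged", or "pending" (hypothetical labels)


# A GO ID at the start of the line, tolerating a stray trailing period.
_LINE = re.compile(r"^(GO:\d{7})\.?\s+(.*)$")
# Status notes as written in the list above, matched case-insensitively.
_RENAMED = re.compile(r"\s*-?\s*OK\s+renamed\s+TM\s*$", re.IGNORECASE)
_MERGED = re.compile(r"\s*(?:OK\s+)?merged?\s+(?:with|in\s?to)\b.*$", re.IGNORECASE)


def parse_entry(line: str) -> Optional[TermEntry]:
    """Return a TermEntry, or None for lines that do not start with a GO ID."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    go_id, rest = match.groups()
    for status, pattern in (("renamed", _RENAMED), ("merged", _MERGED)):
        hit = pattern.search(rest)
        if hit:
            # Everything before the status note is the term label.
            return TermEntry(go_id, rest[:hit.start()].strip(), status)
    # No status note: the term has not been dealt with yet.
    return TermEntry(go_id, rest.strip(), "pending")
```

Filtering the parsed entries for `status == "pending"` would give the terms that still need a decision.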
Hi sorry what is the plan here?
Using QuickGO I have identified that there are about 2000 manual experimental annotations to these terms:
UniProt | 17.79 | 396
MGI | 17.21 | 383
RGD | 11.77 | 262
ZFIN | 10.20 | 227
SGD | 10.02 | 223
FlyBase | 8.67 | 193
TAIR | 3.82 | 85
EcoCyc | 3.46 | 77
WormBase | 3.41 | 76
PseudoCAP | 2.88 | 64
I think you need to let all these groups know what you are proposing. Also, I think these general terms might be useful when used by InterPro etc., because there may not be a specific substrate for all members of a protein family. Note that there are 4 million annotations directly using these terms. The statistics for these terms show that 61% of annotations using these exact terms are provided by InterPro:
InterPro | 61.54 | 2,465,983
UniProt | 35.17 | 1,409,305
GOC | 2.52 | 100,833
Ensembl | 0.26 | 10,398
EnsemblFungi | 0.18 | 7,346
GO_Central | 0.18 | 7,315
EnsemblMetazoa | 0.03 | 1,115
RGD | 0.02 | 990
EnsemblPlants | 0.02 | 846
MGI | 0.02 | 661
Thanks
Ruth
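As a quick sanity check on the all-annotation table above, the following Python sketch recomputes each group's share from the listed counts and back-calculates the total implied by a reported percentage. The function names are made up for illustration, and the sketch assumes the table lists only the top ten contributing groups, so recomputed shares may differ slightly from the reported ones.

```python
# Counts copied from the all-annotation table above (ten listed groups only).
ALL_ANNOTATIONS = {
    "InterPro": 2_465_983,
    "UniProt": 1_409_305,
    "GOC": 100_833,
    "Ensembl": 10_398,
    "EnsemblFungi": 7_346,
    "GO_Central": 7_315,
    "EnsemblMetazoa": 1_115,
    "RGD": 990,
    "EnsemblPlants": 846,
    "MGI": 661,
}


def shares(counts: dict) -> dict:
    """Percentage of the listed annotations contributed by each group."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


def implied_total(count: int, reported_percent: float) -> float:
    """Total annotation count implied by one row's count and reported share."""
    return count / (reported_percent / 100)


if __name__ == "__main__":
    for group, pct in shares(ALL_ANNOTATIONS).items():
        print(f"{group:15s} {pct:6.2f}%")
    print(f"implied total: {implied_total(2_465_983, 61.54):,.0f}")
```

On these figures InterPro comes out at a little over 61% of the listed rows, and the implied total is roughly 4 million, consistent with the numbers quoted in the comment.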
Hi Ruth,
We are saying that most of these should really be transmembrane transport (for example glucose). They will not get a 'do not annotate' tag (at least that's not the plan at this point).
OK ?
Are there no cases where these things are transported across tissues or around organisms? These merges seem a bit dangerous.
Across organisms I haven't seen.
I have come across intestinal absorption and transepithelial transport, these have not been put under transmembrane. OK ? I'll post a comment if I find anything that looks different - so far all transport I looked at was mediated by normal TM transporters.
Thanks, Pascale
I also think that for each one you need to be sure that there is no case involving endocytosis, where the transport is not across the membrane. As mentioned before, iron transport is not always (ever?) transmembrane, therefore
GO:0072511 divalent inorganic cation transport GO:0072512 trivalent inorganic cation transport
for example should not be merged with transmembrane transport
In future please make sure someone from each group contributing annotations that will be affected is included on the ticket. In this case UCL has no direct annotations to the terms listed, however we do have annotations to the child terms.
There is such a long list of terms above, who is going to be responsible for checking that for each term there is not an endocytosis mechanism?
Ruth
Hi @RLovering Thanks for the feedback. Here's what I do: For each term:
As mentioned in the previous ticket, I am not touching iron and lipids for now.
Thanks, Pascale
Look closely at the kidney and maybe transport into and out of the nervous system.
Thanks Pascale
But how are we to know which ones you are doing? The terms I have listed above are in your full list, but according to you they were in the category for merging.
So why are they in the list?
Ruth
Hi @RLovering You're right, I don't know a priori which terms are excluded. I've been adding them at the bottom of the list (the part that starts with ??). I'm doing the straightforward ones first.
@pgaudet this might be done. Maybe you can close and just request any more as we see them (which is what I have been doing)
I agree more work needs to be done, but this ticket is not really useful for describing it.
As a follow up on this:
https://github.com/geneontology/go-annotation/issues/1565
@ValWood writes:
Many substrate transport terms exist, with a non-transport type specific term, and a transmembrane transport term, even if transmembrane transport is the only known mechanism of transport.
So, for example, here we have "carboxylic acid transport" and "tricarboxylic acid transport" even though these are transported only by transmembrane transport as far as we know. This occurs right through the transport branch. Many curators annotate to these non-specific terms like "cation transport" or "glucose transport" and not to the appropriate "transmembrane transport" term.
One solution would be to review and merge these "redundant terms". A better way to handle this might be to make all of the non-mechanism-specific terms "not for direct annotation". This would force curators to ensure they are selecting the correct and most specific terms, which would improve annotation. After a while, a merge could be considered if no problems were encountered (i.e., no non-transmembrane type transport of such a molecule turned out to exist).
We have an internal rule that for transport it should always specify the transport type, and we have a few "sterol transport" annotations which break this rule. To fix this we would require a BP term for "carrier-type transport in aqueous environments"
@dosumis writes:
That sounds sensible to me. The issue you describe is really an annotation issue - rather than any inconsistency in the way the ontology is built. The general import/export terms that don't specify a location have a similar issue.
Seems reasonable to request, as long as you can make a case for other proteins being involved besides the carrier (pretty sure you can in this case).
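To make the "not for direct annotation" proposal quoted above concrete, here is a minimal Python sketch of how a check might flag direct annotations to blocked terms. Everything here is hypothetical: the `Annotation` type, the gene names, and the three-term blocklist (a small subset picked from the list earlier in this ticket) are assumptions for illustration, not how any annotation pipeline actually implements the rule.

```python
from typing import Iterable, List, NamedTuple


class Annotation(NamedTuple):
    gene: str   # hypothetical gene identifier
    go_id: str  # GO term the gene is directly annotated to


# Hypothetical subset of blocked terms, taken from the list in this ticket.
NOT_FOR_DIRECT_ANNOTATION = {
    "GO:0006811": "ion transport",
    "GO:0006812": "cation transport",
    "GO:0046942": "carboxylic acid transport",
}


def find_blocked(annotations: Iterable[Annotation],
                 blocked=NOT_FOR_DIRECT_ANNOTATION) -> List[Annotation]:
    """Return annotations made directly to a blocked, non-specific term."""
    return [a for a in annotations if a.go_id in blocked]


if __name__ == "__main__":
    sample = [
        Annotation("geneA", "GO:0006812"),  # blocked: cation transport
        Annotation("geneB", "GO:1904659"),  # specific transmembrane term
    ]
    for a in find_blocked(sample):
        print(f"{a.gene}: {a.go_id} ({NOT_FOR_DIRECT_ANNOTATION[a.go_id]}) "
              "needs a more specific term")
```

A check like this only flags direct use of a term; annotations to more specific child terms are unaffected, which is the point of the proposal.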