pombase / website

PomBase website v2

Making protein features more visible #2218

Closed ValWood closed 3 weeks ago

ValWood commented 1 month ago
kimrutherford commented 1 month ago

Add a commonly used query for ALL protein features (motifs, signal sequences and TM domains)

Do we want everything with a SO term?: https://www.pombase.org/term/SO:0000001

kimrutherford commented 1 month ago

Do we want everything with a SO term?: https://www.pombase.org/term/SO:0000001

Probably polypeptide_region (SO:0000839) would be a better term?

I've added the commonly used query as "Genes annotated with protein features". Let me know if you can think of a better wording.

kimrutherford commented 1 month ago

Change tab label in query builder to "protein features" (from protein motifs)

Done

Add TM domains to the Examples help text (under search box)

Done. I added "TM domains, " but if you search for that string it doesn't find anything useful. Maybe we should change the text to "transmembrane helix"?

ValWood commented 1 month ago

We only use SO in this way for protein features so I think polypeptide_region (SO:0000839) would be best

ValWood commented 1 month ago

The list from the query is like this:

https://www.pombase.org/results/from/id/384bce14-b710-48a0-9dbf-cfa3f18f1b82

can we make it like this? https://www.pombase.org/term_genes/SO:0000839

so that people can access the subclasses?

kimrutherford commented 1 month ago

can we make it like this? https://www.pombase.org/term_genes/SO:0000839

Linking to that page from the commonly used queries will take a bit of work because it's not a standard results page.

But if we add the SO term ID to the query description, it will be a clickable link to that page:

image

Is that enough?
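A minimal sketch of the idea above: finding SO term IDs in a query description and turning them into links. The helper name `linkify_so_terms`, the ID pattern (`SO:` plus seven digits) and the default `term_genes` URL prefix are assumptions for illustration only; this is not the PomBase website's actual implementation.

```python
import re

# Assumption: SO term IDs are "SO:" followed by exactly seven digits.
SO_ID_PATTERN = re.compile(r"\bSO:\d{7}\b")


def linkify_so_terms(description, base_url="https://www.pombase.org/term_genes/"):
    """Wrap each SO term ID in the description in an HTML link (hypothetical helper)."""

    def to_link(match):
        term_id = match.group(0)
        return f'<a href="{base_url}{term_id}">{term_id}</a>'

    return SO_ID_PATTERN.sub(to_link, description)


# Example description; the wording is taken from this thread, the ID placement is assumed.
description = "Genes annotated with protein features (SO:0000839)"
print(linkify_so_terms(description))
```

Passing `base_url="https://www.pombase.org/term/"` instead would point the same ID at the term page rather than the gene list.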

ValWood commented 1 month ago

yes that would do just fine!

ValWood commented 1 month ago

Actually linking directly to https://www.pombase.org/term/SO:0000839 would be even better (i.e. it shows the subsets)

kimrutherford commented 1 month ago

Actually linking directly to https://www.pombase.org/term/SO:0000839 would be even better (i.e. it shows the subsets)

Perhaps we could have that link in a different list? It feels wrong to have it in the list of commonly used queries since it's not a query - it's just a link to a term page.

ValWood commented 1 month ago

That's true. Let's chat about it on Tuesday.

ValWood commented 3 weeks ago

Thoughts

  1. I'm not sure how useful this list is if it can't be used to access the subsets, so maybe we don't require this "commonly used query" (because it's such a heterogeneous bin).

resolve by removing "transmembrane domain" prompts from the SO search

ValWood commented 3 weeks ago

OK (Re point 5 and 6) we used mitochondrial_targeting_signal (SO:0001808) because we thought it would be a more meaningful label so the prompt would need to be "mitochondrial targeting signal"

kimrutherford commented 3 weeks ago

helix (I don't think helix is in here)

There are some transmembrane_helix annotations: https://www.pombase.org/term/SO:0001812

kimrutherford commented 3 weeks ago

It seems slightly odd to have 2 separate ways to search for TM domains (under TM domains or with the SO term),

I agree. Searching with a SO term name is quite obscure so I'm not sure it will be a problem.

kimrutherford commented 3 weeks ago

We have "TM domains" and "helix" separately in the help text. Perhaps we can combine those to just "TM helix"?

kimrutherford commented 3 weeks ago

should "signal sequence" in the prompts be "signal peptide" (name)

Good point. I've fixed that. I'll re-release the site soon.

kimrutherford commented 3 weeks ago

OK (Re point 5 and 6) we used mitochondrial_targeting_signal (SO:0001808) because we thought it would be a more meaningful label so the prompt would need to be "mitochondrial targeting signal"

True!

It would be helpful if "transit peptide" was a synonym, but SO:0001808 only has:

synonym: "mitochondrial signal sequence" EXACT [] 
synonym: "mitochondrial targeting signal" EXACT []
synonym: "MTS" EXACT []
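As a rough sketch, the synonym check can be done programmatically by parsing the three OBO `synonym:` lines quoted above. The function names `parse_synonyms` and `has_synonym` and the regular expression are illustrative assumptions, and the pattern only handles the simple `synonym: "text" SCOPE` form shown here, not the full OBO format.

```python
import re

# The three synonym lines quoted above for SO:0001808.
OBO_LINES = '''synonym: "mitochondrial signal sequence" EXACT []
synonym: "mitochondrial targeting signal" EXACT []
synonym: "MTS" EXACT []'''

# Captures the quoted synonym text and the scope keyword that follows it.
SYNONYM_PATTERN = re.compile(r'^synonym:\s+"([^"]*)"\s+(\w+)')


def parse_synonyms(obo_text):
    """Return a list of (text, scope) pairs from OBO synonym lines."""
    synonyms = []
    for line in obo_text.splitlines():
        match = SYNONYM_PATTERN.match(line.strip())
        if match:
            synonyms.append((match.group(1), match.group(2)))
    return synonyms


def has_synonym(obo_text, query):
    """Case-insensitive check for a synonym whose text equals the query."""
    return any(text.lower() == query.lower() for text, _ in parse_synonyms(obo_text))


print(has_synonym(OBO_LINES, "transit peptide"))                 # False
print(has_synonym(OBO_LINES, "mitochondrial targeting signal"))  # True
```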

I've changed "signal peptide" to "mitochondrial targeting signal" in the help text. The change will be on the main site soon.

kimrutherford commented 3 weeks ago

I've changed "signal peptide" to "mitochondrial targeting signal" in the help text.

Sorry, that's nonsense. I have a bit of a cold and it has made my brain mushy.

I've restored "signal peptide" to the help text and added "mitochondrial targeting signal".

ValWood commented 3 weeks ago

helix (I don't think helix is in here)

Just make sure this "helix" refers to "TM helix" and not to "helix" as in protein secondary structure

ValWood commented 3 weeks ago

We have "TM domains" and "helix" separately in the help text. Perhaps we can combine those to just "TM helix"?

I think I prefer to say "transmembrane" in full here for the label (as I expect most people will search on this)

Are all transmembrane domains helices? (I have no idea!)

ValWood commented 3 weeks ago

ChatGPT: No, not all transmembrane (TM) domains are helices, although many are. Transmembrane domains can adopt different structures depending on the type of protein, its function, and the environment. The two most common types of TM domain structures are:

Alpha helices: The most prevalent structure in transmembrane proteins found in the lipid bilayer, especially in eukaryotic cells, is the alpha helix. Alpha-helical TM domains are typical for single-pass or multi-pass membrane proteins, such as G protein-coupled receptors (GPCRs), ion channels, and transporters.

Beta-barrels: Some transmembrane proteins, particularly in the outer membranes of Gram-negative bacteria, mitochondria, and chloroplasts, form beta-barrels. These proteins are made up of beta-strands that come together to create a barrel-shaped structure. Examples include porins and some transporters in the outer membrane.

So, while alpha helices are very common, TM domains can also consist of beta-strands, or even less common structures depending on the protein’s nature.

ValWood commented 3 weeks ago

So "transmembrane domain" (I don't think we need to distinguish the subtype at this juncture). But if we can import additional non-redundant transmembrane domains from UniProt we should (they should probably be completely non-overlapping with the existing ones).

ValWood commented 3 weeks ago

I agree. Searching with a SO term name is quite obscure so I'm not sure it will be a problem.

I agree. I think we could actually omit mention of "transmembrane domains" from the help text for the SO search. If people find them this way, that's OK, but the default should be the TM query tool which allows the additional functionality of TM domain number.

ValWood commented 3 weeks ago

Just make sure this "helix" refers to "TM helix" and not to "helix" as in protein secondary structure

Ignore this comment. I forgot it was referring to the search text.

kimrutherford commented 3 weeks ago

I think we could actually omit mention of "transmembrane domains" from the help text for the SO search.

That makes sense. I'll do that now.

kimrutherford commented 3 weeks ago

I'll do that now.

After the change it's:

Examples: short motifs such as NLS or KEN box, signal peptide, mitochondrial targeting signal,
cleaved region

Are there any other important features we could mention? There's plenty of space if we need a longer list.

kimrutherford commented 3 weeks ago

That change is on pombase.org now. Can we close this?

ValWood commented 3 weeks ago

Yep, thanks!