pombase / website

PomBase website v2
MIT License

transcript section - gene page #1517

Closed mah11 closed 4 years ago

mah11 commented 4 years ago

At long last we have thought of good reasons to add a transcript section back to gene pages - there's nowhere else to show:

also include transcript type (mRNA, ncRNA … any others?) even though it's also at top of page

@kimrutherford will future-proof it to support multiple transcripts

also some discussion in chat to the effect that protein features will probably remain associated only with main version of transcript; can rethink later (which is good because I don't think I caught all the details)

successor to #350

kimrutherford commented 4 years ago

I've started this by getting the information into a new section on the gene page and making a simple diagram. I need suggestions for how to format things, especially all the coordinates.

Here's what I've got on my desktop: screen03

Live version: http://desktop.kmr.nz:4242/gene/SPCC285.08

ValWood commented 4 years ago

@JackyVH @bahler

Hi Jurg, Jacky, could you ask your teams and provide any input (suggestions/requests) for this feature? (I will tag Juan too once I have a Git handle.)

ValWood commented 4 years ago

@juanmatacambridge

kimrutherford commented 4 years ago

Thanks Val. I've implemented some of those suggestions. Now if you mouse-over part of the diagram it highlights the coordinates below, and vice versa. I'm still working on the rest of your list.

Here's the version on my desktop (until 1pm or so): http://desktop.kmr.nz:4242/gene/SPAC1296.04

transcript-section-3

It doesn't look so good for transcripts with lots of exons: http://desktop.kmr.nz:4242/gene/SPAC22F3.03c

transcript-section-4

kimrutherford commented 4 years ago

Perhaps we do not need to specify the intron coordinates if we provide the exon coordinates.

If it helps, we could format the table like this (but with transcript coordinates):

transcript-section-5
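To illustrate why the intron coordinates are redundant once the exons are known, here is a minimal sketch of deriving them. The function name, the 1-based inclusive convention and the example coordinates are all assumptions for illustration, not what the site actually does.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive genomic (start, end)


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons, i.e. the introns."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only emit an intron when there is a real gap between the exons.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical exon coordinates, not taken from any real gene.
example_exons = [(100, 200), (251, 400), (460, 500)]
print(introns_from_exons(example_exons))  # [(201, 250), (401, 459)]
```

Sorting by start position means the same function works for either strand; only the numbering of the introns would need to be reversed for reverse-strand genes.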

ValWood commented 4 years ago

I was imagining "exon number" aligned underneath the image, it doesn't really work on the exon itself. However, I don't think numbering the exons is necessary now because the number is clear from the reciprocal mouse over (which is great)

I like the 2 column version for exons and introns. Much easier to scan to see the actual feature coordinates, but intron coordinates are there if you really want them, and it will be aesthetically more pleasing.

kimrutherford commented 4 years ago

I was imagining "exon number" aligned underneath the image,

I think that will be tricky, especially for transcripts with lots of exons.

However, I don't think numbering the exons is necessary now because the number is clear from the reciprocal mouse over

Great, I'll remove the numbering.

I like the 2 column version for exons and introns.

I'll implement that. The screenshot was only a mock-up.

kimrutherford commented 4 years ago

Great, I'll remove the numbering.

I'll implement that. The screenshot was only a mock-up.

Both of those are done now: http://desktop.kmr.nz:4242/gene/SPCC18B5.06

screen04

juanmatacambridge commented 4 years ago

It looks great but I'm a bit confused - shouldn't exon 1 start @ 729054?

ValWood commented 4 years ago

we should label as "coding exon"

ValWood commented 4 years ago

Should we display genomic coordinates or transcript coordinates by default?

ValWood commented 4 years ago

we should label as "coding exon" or would that be weird... because the 5' UTR is still part of the coding exon. But we need some way to refer to the CDS part of an exon split between a UTR and a coding region...

kimrutherford commented 4 years ago

It looks great but I'm a bit confused - shouldn't exon 1 start @ 729054?

I think you're right.

we should label as "coding exon" or would that be weird... because the 5' UTR is still part of the coding exon.

We should chat about exactly how to label things. We also need to handle cases like SPAC1039.01 where there are introns in a UTR.

Should we display genomic coordinates or transcript coordinates by default?

I vote for the genomic coordinates as I think they are more useful.

juanmatacambridge commented 4 years ago

Can't we label exons and introns independently of where the UTRs and CDS are, and then label the 5' UTR, 3' UTR and CDS separately? The exons should be whatever is left in the mature mRNA, irrespective of whether they contain coding or non-coding sequence (for example, SPAC1039.01 contains non-coding exons). In the case of SPAC1039.01.1, the first exon would be 5444360-5444400, exon 2 would be 5444448-5445123, etc.

I agree with using genomic coordinates by default.
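A rough sketch of that idea: keep the exon list as it is in the mature mRNA, then overlay the CDS boundaries to get the 5' UTR, CDS and 3' UTR parts of each exon. The two exons are the ones quoted above; the CDS bounds are invented for illustration, and the sketch assumes a forward-strand transcript (on the reverse strand the 5' and 3' labels would swap). `split_exons` is a hypothetical helper, not part of the PomBase code.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive genomic (start, end)
Part = Tuple[str, int, int]  # (label, start, end)


def split_exons(exons: List[Interval], cds: Interval) -> List[Part]:
    """Label each exon segment as 5'UTR, CDS or 3'UTR (forward strand only)."""
    cds_start, cds_end = cds
    parts: List[Part] = []
    for start, end in sorted(exons):
        if start < cds_start:
            # Part of the exon upstream of the CDS.
            parts.append(("5'UTR", start, min(end, cds_start - 1)))
        if end >= cds_start and start <= cds_end:
            # Part of the exon overlapping the CDS.
            parts.append(("CDS", max(start, cds_start), min(end, cds_end)))
        if end > cds_end:
            # Part of the exon downstream of the CDS.
            parts.append(("3'UTR", max(start, cds_end + 1), end))
    return parts


# Exons as quoted in the comment; the CDS bounds below are HYPOTHETICAL.
exons = [(5444360, 5444400), (5444448, 5445123)]
hypothetical_cds = (5444500, 5445000)
for label, start, end in split_exons(exons, hypothetical_cds):
    print(f"{label}\t{start}-{end}")
```

With these made-up CDS bounds the first exon comes out as entirely 5' UTR, which is the "non-coding exon" case, and the second is split into a UTR part, a CDS part and a 3' UTR part.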

kimrutherford commented 4 years ago

It looks great but I'm a bit confused - shouldn't exon 1 start @ 729054?

After a bit of discussion we decided the quick temporary fix is to explain things better. So now the exons are labelled "exon CDS" and a slightly longer text pops up to help. We'll work on this because it would be nice to show the coordinates of the coding and non-coding parts of the exons.

screen06

Should we display genomic coordinates or transcript coordinates by default?

We now display the genomic coordinates by default with a checkbox to change to transcript coordinates:

screen07
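For anyone wondering how the two coordinate systems relate, here is a minimal sketch of the conversion, assuming "transcript coordinates" means 1-based positions along the spliced transcript. The function name and the example exons are hypothetical; this is not the code behind the checkbox.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive (start, end)


def to_transcript_coords(exons: List[Interval], strand: int = 1) -> List[Interval]:
    """Map genomic exon intervals to positions in the spliced transcript."""
    # Walk exons in transcription order: ascending on +, descending on -.
    ordered = sorted(exons, reverse=(strand == -1))
    result = []
    offset = 0  # number of transcript bases already consumed
    for start, end in ordered:
        length = end - start + 1
        result.append((offset + 1, offset + length))
        offset += length
    return result


# Hypothetical exons, for illustration only.
example_exons = [(100, 200), (251, 400)]
print(to_transcript_coords(example_exons, strand=1))   # [(1, 101), (102, 251)]
print(to_transcript_coords(example_exons, strand=-1))  # [(1, 150), (151, 251)]
```

Introns have no length in this coordinate system, which is one reason the genomic view is the more informative default.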

@kimrutherford will future-proof it to support multiple transcripts

It should work OK with multiple transcripts but we might want to reformat things a bit. We should revisit when there are multi-transcript genes in the database.

kimrutherford commented 4 years ago

I forgot to say that this is still only in my local version of the site: http://desktop.kmr.nz:4242/gene/SPAC13A11.04c

juanmatacambridge commented 4 years ago

I like the checkbox to change the coordinates.
It would be good to try it for more complex transcripts - maybe rem1, which shows intron retention in early meiosis?

kimrutherford commented 4 years ago

It would be good to try it for more complex transcripts - maybe rem1, which shows intron retention in early meiosis?

The display for rem1 isn't very interesting at the moment: transcript-diagram-1

juanmatacambridge commented 4 years ago

I don't think it's been annotated - the first intron is spliced specifically in late meiosis

ValWood commented 4 years ago

We would not show or annotate intron retention here; this is just the gene structure.

Later we will annotate alternative transcripts, but not intron retention as a regulation mechanism. At least that is not something that is on our radar for the near future.

juanmatacambridge commented 4 years ago

This one is not just intron retention as regulatory mechanism, because the spliced and unspliced forms have separate functions. But I agree we may not want to deal with them here

ValWood commented 4 years ago

OK, I can add rem1 to our list of alternative transcripts to describe.

ValWood commented 4 years ago

I added rem1 to a table of known alternative transcripts to test the system on: https://github.com/pombase/curation/issues/61. Do you have a PMID?

kimrutherford commented 4 years ago

Still needed here:

Anything else?

ValWood commented 4 years ago

I can't remember what it looks like and your test server is down.

Do we have the transcript coordinates on mouse-over or did people feel that was not important?

I'm happy for it to go live when the 3 above are done anyway because it is useful as it is.

mah11 commented 4 years ago

ping me on #1582 if it's ready for documentation (will be nice to have a screenshot for it)

kimrutherford commented 4 years ago

The test server is up again now.

Do we have the transcript coordinates on mouse-over or did people feel that was not important?

I've added a "Show transcript coords" checkbox to toggle the coordinates.

I'm happy for it to go live when the 3 above are done anyway because it is useful as it is.

OK, I'll work on 1 and 2 and then let Midori know when it's ready for a screenshot and documentation.

mah11 commented 4 years ago

That looks nice!

juanmatacambridge commented 4 years ago

Agree!

On 13/08/2020 11:48, Midori Harris wrote:

That looks nice!



ValWood commented 4 years ago

very nice!

kimrutherford commented 4 years ago

start and end coordinates on the figure

Having tried this I'm not sure it helps. The start and end are in the "Location:" bit just above the graphic. It looks a bit redundant having the start and end a second time. Does anyone have a strong opinion about this? Here's how it looks before and after:

transcript-diagram-start-end-0

transcript-diagram-start-end-1

ValWood commented 4 years ago

yes agreed, it's easy to see in the table. Sorry about that...

kimrutherford commented 4 years ago

yes agreed, it's easy to see in the table. Sorry about that...

No problem. I'm working on adding a key and then it's probably ready for screenshots and docs.

kimrutherford commented 4 years ago

I'm working on adding a key and then it's probably ready for screenshots and docs.

I've added a key and moved the coordinate system toggle button to the top right:

transcript-diagram-key-1

Please let me know if there are any changes I should make before doing screenshots.

ValWood commented 4 years ago

looks good, again with the explicit labelling, probably not even necessary?

mah11 commented 4 years ago

Not sure I understand this from Val

again with the explicit labelling, probably not even necessary?

... but I think it looks fine now and will go ahead with documentation.

mah11 commented 4 years ago

p.s. I like the new toggle button placement

kimrutherford commented 4 years ago

... but I think it looks fine now and will go ahead with documentation.

Thanks.

p.s. I like the new toggle button placement

Cheers. I thought it made sense to put it there as future-proofing for when we have multiple transcripts.

mah11 commented 4 years ago

OK, I've committed some documentation.

kimrutherford commented 4 years ago

OK, I've committed some documentation.

Thanks. The transcript section will be visible on the main site in the morning.

I'm going to close this issue. We can open new issues for any problems.