Closed Antonialock closed 4 years ago
will assign to @kimrutherford Kim once track labelling sorted.
From Dan: "Also, I've realised that I would like to add another track (Hermes HMM-defined elements), which is a standard chr,start,end,name,score type so could be in bed or bigBed format."
Note to self: could call this type of data transposon mutagenesis
Track data description upload format - Jeffares 2018(1).xlsx
@kimrutherford do you have a preference bigbed or bed? (I'm guessing bed since we have other datasets in that format?)
Dan didn't you also mention "and nucleosome density (From Maria)"?
Yes I have the nucleosome density as bigbed. /Dan
ah I see, cheers. we can give it a shot but might be a bit trickier than the formats we have already tried our hands at - we are on a learning curve :-)
JBrowse supports BED and bigBed formats so I think either would be fine.
I can't see this data here yet: ftp://ftp.pombase.org/external_datasets/
does anything else need doing?
Would you mind if I have a look at the manuscript Dan?
Sorry, I've lost track of where we're at. What are the file names?
Dan said he had bigwig files he could send and also pointed us to here: http://bahlerweb.cs.ucl.ac.uk/bioda/ (I never managed to view the data on the Bahler page in any of my browsers)
You said you had a quick look and that it looked straightforward: "I had a quick look and I think this would be easy to add. We can handle bigWig files and the files mostly use the chromosome IDs we need (I, II, III, etc)."
and you said you had a quick look and it looked easy :p
I’m happy to send the bigwig files, or links to them.
Cheers, Dan
Either would be great Dan! Would you mind if I have a look at the manuscript?
Just one request: could you make sure that wherever a chromosome is specified, that this is done in the format "I" "II" or "III"?
So looks like 5 files in total
- Hermes transposon insertions
- Hermes HMM state
- Hermes HMM-defined elements
- Nucleosome positioning in WT
- Conservation (phyloP)
Thanks!
We should make another announcement in a month or so specifying all of the things that providers should do to make this hosting easier. I'll start a list on the website tracker.
@djeffares - Kim just reminded me there's more to the genome than just the chromosomes. The chromosome IDs are: I, II, III, mating_type_region, mitochondrial and chr_II_telomeric_gap
Hi Dan. Links would be great. Either bigwig or wig format is OK.
Hello @Antonialock and @kimrutherford,
I'd like to reopen this ticket, as the paper is now accepted and on early access at MBE: https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msz113/5488193
I now have 5 tracks to provide:
PhyloP conservation estimates from the four Schizosaccharomyces species. File: Conservation-SchizPom_phyloP.bigWig
Nucleosome density data from log phase cells File: Nucleosome-density-wtNucWave-reps-median.depth_wl_trimmed_PE2.bigWig
Hermes transposon insertion counts from log phase cells File: ermes-all-log.counts-incl-VT.2016-10-03.txt.bigWig
HMM states from log phase insertions, reflecting the importance of each position in the genome File: all-log-data.hmmstate.model5A.bigWig
HMM-defined elements (HDEs), representing functional units File: hermes-log.av.mapping.ratio0.9.states.stateblocks.100ntlength.bed
I have made a tar file available on google drive that contains all these files: https://drive.google.com/file/d/1bWqy8luMLY71thv-51JSldpUNzUHFnlu/view?usp=sharing
All these bigWig files should function, as they do on the Biodalliance browser that Danny Bitton set up: http://bahlerweb.cs.ucl.ac.uk/bioda/
If you navigate to position I:198,640..290,753 this will display (on Firefox, and perhaps Chrome).
best wishes Dan
Wonderful news, congratulations!
@kimrutherford do the files need tweaking?
Dan could you also provide track metadata? it is described in here:
Track data description upload format.xlsx
columns A-Q (some obviously not applicable, I can add the PMID once it has one...)
Hi Dan. Thanks for the files. I've grabbed a copy.
I need to change some of the chromosome IDs. I'll do that sometime next week.
Hi @kimrutherford
Any chance you'd found time to look at this?
I'm asking because the paper is now out, and the tweet is getting some likes, so people may want to browse the data. https://twitter.com/danieljeffares/status/1141006963112890369
Happy to help reformat files, if need be.
cheers Dan
Hi Dan.
Sorry, I haven't got to that. I'll have a look on Monday. I'll let you know if I have any questions.
Hi Dan.
I've had a look at the files. I needed to tweak the chromosome IDs in some cases (eg. change "MT" to "mitochondrial") to match what JBrowse expects but they are ready to go now.
Did you see Antonia's comment about the metadata?: https://github.com/pombase/website/issues/767#issuecomment-497456515 I think an extra column has been inserted in the examples in the track description spreadsheet Antonia's attached to that comment. Here's a fixed version: Track.data.description.upload.format.xlsx
Cheers!
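For other data providers, the chromosome ID tweak Kim describes could be sketched roughly as below for a text BED file. This is only an illustration, not the script actually used: the function name is hypothetical, the allowed IDs are the ones listed earlier in this thread, and the only rename taken from the thread is "MT" to "mitochondrial". Binary bigWig files would need converting to a text format first.

```python
# Chromosome IDs listed earlier in this thread as the ones the browser expects.
ALLOWED_IDS = {
    "I", "II", "III",
    "mating_type_region", "mitochondrial", "chr_II_telomeric_gap",
}

# Only the "MT" rename is mentioned in the thread; add others as needed.
RENAMES = {"MT": "mitochondrial"}


def rename_bed_line(line, renames=RENAMES, allowed=ALLOWED_IDS):
    """Return a BED line with its chromosome ID (first column) renamed.

    Hypothetical helper: blank lines, comments and track/browser header
    lines pass through unchanged; an ID that is still not in `allowed`
    after renaming raises ValueError.
    """
    if not line.strip() or line.startswith(("#", "track", "browser")):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    chrom = renames.get(fields[0], fields[0])
    if chrom not in allowed:
        raise ValueError(f"unexpected chromosome ID: {fields[0]!r}")
    fields[0] = chrom
    return "\t".join(fields) + newline
```

Applying it line by line while copying a BED file to a new file leaves everything except the first column untouched, and fails loudly on an ID nobody has mapped yet.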
Hi @kimrutherford
Is this metadata file OK?
Jeffares-2019-track.data.description.upload.format xlsx.xlsx
Thanks Dan. That looks good. I'll try to get the tracks into the browser today.
This label might be a bit too long: Conservation level estimated using phyloP method, from Cactus alignment of S. pombe, S. japonicus, S. octosporus and S. cryophilus genomes (Grech 2019)
Can we shorten it?
Thanks Kim,
How about:
Conservation, estimated using phyloP from alignment of four Schizosaccharomyces genomes (Grech 2019)
Best wishes, Dan
That's great.
I've had to change the commas to semicolons because JBrowse doesn't support commas in track labels. I'd forgotten that. I'm happy to change any labels if they look too naff with the semicolons.
The new tracks are visible now: new tracks
Just to check, the "Nucleosome density from exponentially growing wild-type" row has a different PubMed ID from the other 4. Was that on purpose?
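The comma-to-semicolon change mentioned above is simple enough to sketch. This is a hypothetical helper for preparing labels in the metadata spreadsheet, not part of JBrowse or the PomBase code; it assumes commas are the only problem character, which is all this thread mentions.

```python
def jbrowse_safe_label(label: str) -> str:
    """Replace commas with semicolons so the label can be used as a track label.

    Hypothetical helper: the thread only says commas are unsupported in
    track labels, so nothing else is altered.
    """
    return label.replace(",", ";")


label = ("Conservation, estimated using phyloP from alignment of four "
         "Schizosaccharomyces genomes (Grech 2019)")
safe = jbrowse_safe_label(label)
```

Labels without commas pass through unchanged, so it is safe to run over every row.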
Conservation estimated from (the) alignment of four Schizosaccharomyces genomes using phyloP
and doesn't require a comma or colon
We can announce this (I'm not sure if Antonia is announcing browser hosting in batches though)
We could also add a "research spotlight" to the front page if you have a suitable image Dan,
Here's what it looks like in the region around cdc2, at Spotlight image resolution:
Hi Kim and Val,
This image looks great to me. Clearly fewer transposon insertions in the gene (and even fewer in the antisense ncRNA), different HMM states in the UTRs, and higher conservation (phyloP) in the coding exons.
I’d be happy to us Ethiopian as a research spotlight. Thanks!
Best wishes, Dan
Is
Research spotlight: Grech et al., 2019 The fitness Landscape of the fission yeast genome. Published in Mol Biol Evol. PMID: 31077324
OK?
Ethiopian? @djeffares
Research spotlight: Grech et al., 2019
I've added that to the configuration. It won't be visible until tomorrow morning so there's still time to tweak things.
I haven't included the usual "Publication record in PomBase ..." link because there isn't a publication page for PMID:31077324. Is that expected?
There's room for a longer text if you like. Now I've added the config I realise that a link to JBrowse makes sense. I'll add that.
It will look like this when it appears on the website (plus a JBrowse link once I add that):
That's done. Let me know if you'd like any wording changes.
Looks great, thanks Kim and Val.
Looking at the browser now, I see that the HMM track description is unhelpful.
Could this be altered to the text below?
HMM fitness model (more important regions have lower scores).
Thanks again for displaying and highlighting this.
PS: 'Ethiopian' was an autocorrect typo, amongst many in that message. 🤔
No problem. There are two descriptions that mention HMM. Which is the one to change?:
Excellent!
I have a few questions/comments on the track descriptions.
To clarify what "they are about":
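There are 4 tracks:
- Hermes transposon insertion sites from multiple insertion libraries. In this track each line represents an insertion site (and height reflects how many times it was observed in the cells in the libraries)
- HMM state generated from transposon insertion data. Here the transposon insertion sites are smoothened to states S1-S5. The height of the scale bar corresponds to the state (0=S1, 2=S2, 3=S3...)
- Conservation estimated from alignment of four Schizosaccharomyces genomes using phyloP. This track shows the conservation of S. pombe fitness consequence states S1-S5 in S. octosporus, S. japonicus, and S. cryophilus. I'm really not sure how to interpret the track itself? Could you explain?
- HMM-derived elements (HDEs); genome windows with runs of one HMM state. This track shows continuous runs of states S1-S3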
this is what they look like in the browser:
Secondly, I'm not keen on having 3 new "data types" (transposon insertions, HMM state blocks, transposon HMM state) - can we fit these in under "transposon insertions" and tweak the track descriptions?
New track descriptions:
Track 1: Transposon insertion sites (Grech 2019)
Track 2: Transposon insertion sites smoothened to states S1-S5 (Grech 2019)
Track 3: (needs elaborating so that it is obvious what it is showing) Conservation of S. pombe fitness consequence states S1-S5 in S. octosporus, S. japonicus, and S. cryophilus (Grech 2019)
Track 4: HDE units, continuous runs of states S1-S3 (Grech 2019)
I suggest specification of "assay type" (not great terminology for modelling.. perhaps "method" would be better?) as follows:
Track 1: Hermes
Track 2: HMM
Track 3: PhyloP
Track 4: HMM
I also suggest moving the "sample ID" into the "study ID" column (it looks like a study ID, not a sample ID).
Is it really applicable to both tracks 1 and 4 - should it only be added to track 1?
cheers! @djeffares
also @kimrutherford can we show the labels for the "HMM-derived elements (HDEs); genome windows with runs of one HMM state" track by default?
Secondly, I'm not keen on having 3 new "data types" (transposon insertions, HMM state blocks, transposon HMM state) - can we fit these in under "transposon insertions" and tweak the track descriptions?
I agree- we need the "types" to be broader groupings, and to limit the number. This specificity should be in the description.
The labels are on but they only show when you zoom in:
Ah, I did see those, but because of the coordinates I missed what I thought was the interesting bit: whether the feature is S1, S2 or S3. I didn't see the forest for the trees. Since the coordinates are available when you click on a feature (see screenshot below), perhaps it is better to keep the label simple?
It's on the main site now and is one of the Spotlights that will be shown on the front page: https://www.pombase.org/archive/spotlight
Hi Kim,
Please adjust the first one:
HMM state generated from transposon insertion data (Grech 2019)
Best wishes, Dan
On 3 Jul 2019, at 07:43, Kim Rutherford notifications@github.com wrote:
No problem. There are two descriptions that mention HMM. Which is the one to change?:
HMM state generated from transposon insertion data (Grech 2019)
HMM-derived elements (HDEs); genome windows with runs of one HMM state (Grech 2019)
Could this be altered to the text below? HMM fitness model (more important regions have lower scores).
Hi Dan.
I've made that change. It will be on pombase.org on Saturday morning.
Antonia has a different suggestion for that track and some of the others: https://github.com/pombase/website/issues/767#issuecomment-508185063
Let us know what you think.
Hi @djeffares Did you see my questions above?
Hi @Antonialock,
Sorry, I missed these questions. Answers below:
To clarify what "they are about":
Hermes transposon insertion sites from multiple insertion libraries. In this track each line represents an insertion site (and height reflects how many times it was observed in the cells in the libraries)
DJ: Yes, correct.
HMM state generated from transposon insertion data. Here the transposon insertion sites are smoothened to states S1-S5. The height of the scale bar corresponds to the state (0=S1, 2=S2, 3=S3...)
DJ: Yes, correct. But why is it that 0=S1? Surely S1 (state 1) should have height=1?
Conservation estimated from alignment of four Schizosaccharomyces genomes using phyloP. This track shows the conservation of S. pombe fitness consequence states S1-S5 in S. octosporus, S. japonicus, and S. cryophilus. I'm really not sure how to interpret the track itself? Could you explain?
DJ: No, this track is not generated from the Hermes transposon insertions. It is the conservation of each site over the phylogeny of the four Schizosaccharomyces species (S. pombe, S. octosporus, S. japonicus, and S. cryophilus). Higher values mean more conservation (the scale is a negative logged P-value). The values were generated from a genome alignment, using the phyloP algorithm.
HMM-derived elements (HDEs); genome windows with runs of one HMM state. This track shows continuous runs of states S1-S3
DJ: Yes.
Secondly, I'm not keen on having 3 new "data types" (transposon insertions, HMM state blocks, transposon HMM state) - can we fit these in under "transposon insertions" and tweak the track descriptions?
DJ: Yes, this is fine.
New track descriptions:
Track 1: Transposon insertion sites (Grech 2019)
Track 2: Transposon insertion sites smoothened to states S1-S5 (Grech 2019)
Track 3: (needs elaborating so that it is obvious what it is showing) Conservation of S. pombe fitness consequence states S1-S5 in S. octosporus, S. japonicus, and S. cryophilus (Grech 2019)
Track 4: HDE units, continuous runs of states S1-S3 (Grech 2019)
DJ: Yes, but I think "smoothed" is simpler & in more common use.
I suggest specification of "assay type" (not great terminology for modelling.. perhaps "method" would be better?) as follows:
Track 1: Hermes Track 2: HMM Track 3: PhyloP Track 4: HMM
DJ: What about Transposon rather than Hermes, which is more generic.
I also suggest to move the "sample ID" into "study ID" column ? (it looks like a study ID not a sample ID?)
DJ: Yes, fine.
Is it really applicable to both tracks 1 and 4 - should it only be added to track 1?
DJ: Yes, fine.
cheers Dan
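For anyone reading values off the conservation track, Dan's description of the scale as "a negative logged P-value" can be made concrete with a small sketch. Two assumptions are not confirmed in this thread: that the logarithm is base 10, and that the scores on this track are non-negative. The function name is hypothetical.

```python
def phylop_score_to_p_value(score: float) -> float:
    """Convert a conservation score to a P-value, assuming score = -log10(P).

    Assumptions (not confirmed in the thread): base-10 logarithm and
    non-negative scores. Higher scores give smaller P-values, i.e. more
    conservation.
    """
    if score < 0:
        raise ValueError("this sketch only handles non-negative scores")
    return 10.0 ** (-score)
```

Under these assumptions a score of 0 corresponds to P = 1 (no evidence of conservation), and each additional unit of score lowers the P-value tenfold.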
Thanks! I updated the descriptions:
I thought it was smoothened not smoothed - I blame the Drosophila researchers! :-)
"Hermes" is perfect in the assay description - in this field we want a detailed method type. In comparison "data type" is a higher level grouping term for different datasets (e.g. transcripts, chromatin binding sites...).
Let me know if you want anything tweaked, otherwise I'll announce tomorrow
The fitness landscape data is hosted and announced so I am closing this ticket.
We have some data from a manuscript that we’d like to have displayed on your beautiful new genome browser.
It is Hermes transposon insertion data, nucleosome data and HMM model data (one state for each position in the genome, derived from the insertion data). Also conservation measures from a new alignment of new Schizosaccharomyces species genome assemblies.
So four tracks in all. It is displayed here at the moment: http://bahlerweb.cs.ucl.ac.uk/bioda/ (best viewed with Firefox).
I have bigWig files at the moment, but would be happy to reformat if need be.