Open lpantano opened 6 years ago
I forgot to mention: please click on Watch on https://github.com/miRTop/mirGFF3 so you know what is going on when the file is modified, and give it a star as well :)
Hi all,
Did anyone have a chance to go through this? I would love to have your input.
Thanks!
I think this is a great plan. Please let me know how I can better engage and help out. I might have a trainee who can do some analysis of the Tewari data, if you need help there.
Thanks @mhalushka, having more people would help. I am trying to get a trainee as well.
I'll wait until I have a couple more comments and, if everybody agrees, work on the draft for the GFF paper so we have it for August.
Cheers
Hi Lorena, yes, I agree too that this is a great plan. Logo, independent repository to make this format independent of miRTOP, submission to file format databases: all that sounds great to me and seems like efficient new steps towards a solid global format. Also, having an initial very short publication of the format before doing the more extensive study is likely a good move. This initial publication may attract more people to the group, and this may lead to novel ideas that could improve the solid bases we already have. I really like the quote you wrote down from Tracy Teal ("If you want to go fast, go alone; If you want to go far, go together"); I think it summarizes very well our situation and our common interest in making things happen as a group. I will try to get more data out soon and comment on the format definition. Please let me know if I can help with this short note you want to write or if I can help in any other way! I will also add a slide about the group in all my miRNA-related talks from now on! Cheers,
Hi all, Lorena, thanks for your continuous effort to move this forward. Publishing is certainly always a good idea. Maybe we can have a teleconference next week to discuss the scope of this paper? Right now I am not sure if a format-only paper would be very short. It might be important to motivate the new format by discussing current methods and formats, in order to make clear that this new format is something useful. Are you thinking of publishing the conversion/downstream analysis tools separately or together with the Tewari analysis? Maybe the conversion tool(s) could go with the format paper, and the downstream analysis (getting statistics out of the GFF format) with the Tewari data? Best, Mic
On 2 July 2018 at 21:08, Lorena Pantano notifications@github.com wrote:
@lpantano @gurgese @ThomasDesvignes @mhalushka @mlhack @keilbeck @BastianFromm @ivlachos @TJU-CMC @sbb25 @phillipeloher
Hi all,
It will be a slightly long email, but please take 15 min to go over it. It will help us decide how we start spreading the word about this.
- BOSC was great; people had a lot of questions and the format was received with open arms.
- I got a lot of good ideas to help with the format and get people using it:
  - make a logo: we are having the competition this week, so we are almost there.
  - create a separate repository with the format only, see here https://github.com/miRTop/mirGFF3
  - submit to EDAM ontology and FAIRsharing, which are databases that keep track of formats and databases (I submitted to both, and we are waiting to be reviewed). @BastianFromm, maybe you want to submit mirGeneDB to FAIRSharing.org?
  - publish a very small paper with the format only. I am actually in favor of doing this. We can publish in F1000 https://f1000research.com. It is open and they allow very short papers. The main idea is to have something out soon so people become aware of the format; without a paper it is more difficult, and the current work with the Tewari data is great but it will need time. Can you tell me what you think?
The deadline for important modifications to the first version of the format is in 1 week (07/08/2018), just to be sure we spread the word with the very first usable format.
For that I need all of you to go to the definition https://github.com/miRTop/mirGFF3/blob/master/definition.md and open an issue with anything you think is important to have and that we don't have yet. Anything you would need if you wanted to develop something on top of this format, in terms of querying, visualization, re-mapping: anything you would need to know.
As a final idea, everyone recommended presenting this in as many places as possible, but I cannot do it alone. So even if it is a slide in a talk, just do it. Having a paper will help, but you can still do this at any time. If you go to a conference and have a poster, mention this to people as well, so we can create an ecosystem of miRNA data analysis tools.
In summary:
- I need your thoughts about publishing the format in F1000 (very short paper), leaving the Python tool and the Tewari data for the next publication.
- I need your feedback to make the format useful for everybody.
Thanks!
-- Michael Hackenberg Profesor Titular / Associate Professor Computational Epigenomics Lab (http://bioinfo2.ugr.es) Departamento de Genética Universidad de Granada
I am happy to jump on a conference call. I'd like to figure out what to assign my trainee to work on.
Thanks Thomas for chiming in.
Thanks Michael for the input.
I proposed leaving the tool and the Tewari data for a future paper because the tool needs a lot of work if we want to have all the most important features added. For instance, querying the file is not implemented yet.
I also think that working on the format for a publication will produce some changes for the better, and those will affect the tool.
So my current plan is to publish the format. We can make the paper long, but then I need more contributions, because I won't be able to come up with all the perspectives myself. I think it would be great to have a good discussion in the paper, if everybody is on board. Also, there is actually no widely used format, which makes it easier to promote the work we are doing.
If this plan goes ahead, I think mentioning that there is an open community developing the tool can bring in more collaborators and make the tool better for the publication with the Tewari data; plus, we'd use the data to make the point that the tool helps integrate a large number of samples and tools.
Here is the Doodle to try to schedule a conference call:
https://doodle.com/poll/ufinbin4fv772eee
Thanks all for your feedback!
Sorry, I added times to the poll. Remember, the times are in ET. Thanks
Thanks for choosing your time:
I'll set up the meeting for Thursday 19, from 9-11am ET. Some of you can make it at 9 and others at 10, so I'll be there for the two hours and I can update Thomas, who can only make it at 10am (sorry Thomas, I totally forgot you are 3h behind me :( ; we can meet even later that day if that is OK). I will send minutes at the end with the plan that we can hopefully agree on.
I'll send the invitation on Monday!
Thanks!
The plan looks good to me and Thursday works great.
Hi all,
this is the invitation:
Topic: Mirtop - road plan 2018
Time: Jul 19, 2018 9:00 AM Eastern Time (US and Canada)
Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/221604941
Or iPhone one-tap: US: +16699006833,,221604941# or +16468769923,,221604941#
Or Telephone: Dial (for higher quality, dial a number based on your current location): US: +1 669 900 6833 or +1 646 876 9923
Meeting ID: 221 604 941
International numbers available: https://zoom.us/u/eu3Ib7wO5
Thank you. I will be on the call. Also, the Tewari paper came out in Nat Biotech. So there is nothing holding us back from moving forward with their data now and getting our findings published. https://www.ncbi.nlm.nih.gov/pubmed/30010675
Minutes:
publications:
Marc will check we have all the data as it is published in GEO database
Philip will re-analyze the data with their tool
Hi all,
just to follow up with specific plan:
I set up a biweekly meeting to talk about the Tewari data; everybody who can join is welcome. I know that at the beginning it will be difficult to get a lot of people, but I hope it works once you get used to having this day and time blocked off. Here is the calendar of the miRTop project
You can add the meeting event using this link
The event will be every two weeks, every Thursday at 10am ET. I'll send a reminder one day before.
As a final item, I'll start the format definition paper and share a Google Doc with you all. I'll try to set up some deadlines that will be contained in the document itself.
Thanks all for continuing to push!
Information to join the tewari meeting (I added to the calendar as well):
Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/553765969
Or iPhone one-tap: US: +16465588665,,553765969# or +14086380986,,553765969#
Or Telephone: Dial (for higher quality, dial a number based on your current location): US: +1 646 558 8665 or +1 408 638 0986
Meeting ID: 553 765 969
International numbers available: https://zoom.us/u/b0MgRck4D
Hi all,
All the minutes will be kept here: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md, feel free to bookmark this to catch up.
As well, feel free to check the road map for the project:
https://github.com/miRTop/incubator/projects/2
Thanks!
I am away on the 30th, but hopefully Arun can represent us at the talk.
Sure Marc. I will attend the meeting on August 30. Thanks Lorena for the update.
Thank you, Arun.
-- Arun H Patil Ph.D. Student Institute of Bioinformatics Discoverer Building, 7th Floor International Tech Park Whitefield, Bangalore - 560 066 Karnataka, India. Mobile:(+91) 9964121551 Phone:(+91) 80-28416140 Fax: (+91) 80-28416132
Hi,
Here are the minutes from yesterday: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#08-30-2018
Next meeting will be on Sept 20.
Hi all,
It would be good to get as many of you as possible for tomorrow's meeting. We have spotted an important result that needs a good discussion to decide how to move forward.
Also, I will share the draft of the mirtop format paper so you can contribute and make modifications. The idea is to submit to F1000 at the end of October.
I hope some of you can make it.
Cheers
I will be on the call. Looking forward to it.
Very interested to join. What is the scheduled time?
Dr. Bastian Fromm (PhD)
Senior Researcher
Friedländer group Science for Life Laboratory Department of Molecular Biosciences The Wenner-Gren Institute Stockholm University S-10691 Stockholm Sweden
cell: +47 94 12 29 55 eMail: bastianfromm@gmail.com DB: http://www.mirgenedb.org/
Thanks Lorena,
I am alone with the kids as my wife is travelling for work so am picking them up by then. Cannot join the meeting unfortunately, if it was an hour earlier I could have.
Would love to contribute though. Please let me know if I can do or comment on anything.
Cheers,
Bastian
On Thu, Sep 20, 2018, 03:23 Lorena Pantano notifications@github.com wrote:
Bastian,
the time is 10am Boston time and the link to connect is https://zoom.us/j/553765969
cheers
Hi Bastian,
Sorry the time doesn't work for you. Maybe next time it will be possible for you to join. We chose that time so people on the other coast can join.
I have updated the minutes here:
Sorry, I sent it by mistake before it was ready.
The minutes are here:
https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#09-20-2018
I played more with the data and I think we now see something that may make sense. The bottom line is that, after filtering the data to look only at the miRNAs where the reference is the top expressed sequence, the majority of isomiRs have an abundance lower than 20% of the total miRNA. There is more in the link you'll see inside the minutes.
You can see there the next steps.
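To make the filtering described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name, the input layout, and the toy counts are hypothetical and are not taken from mirtop or from the Tewari data.

```python
def low_abundance_isomir_share(counts, threshold=0.20):
    """Share of isomiRs whose abundance is below `threshold` of the total miRNA.

    `counts` is a hypothetical layout: {mirna: {"ref": int, "isomirs": {name: int}}}.
    Only miRNAs where the reference sequence is the top expressed one are kept.
    """
    fractions = []
    for mirna, c in counts.items():
        isomirs = c["isomirs"]
        # Skip miRNAs where some isomiR is more abundant than the reference.
        if isomirs and c["ref"] < max(isomirs.values()):
            continue
        total = c["ref"] + sum(isomirs.values())
        if total == 0:
            continue
        # Abundance of each isomiR relative to all reads of that miRNA.
        fractions.extend(n / total for n in isomirs.values())
    if not fractions:
        return 0.0
    below = sum(f < threshold for f in fractions)
    return below / len(fractions)


# Toy, made-up counts (not real data).
toy_counts = {
    "miR-a": {"ref": 80, "isomirs": {"iso1": 15, "iso2": 5}},
    "miR-b": {"ref": 10, "isomirs": {"iso1": 90}},  # reference is not on top: dropped
    "miR-c": {"ref": 50, "isomirs": {"iso1": 50}},  # tie: kept in this sketch
}
share = low_abundance_isomir_share(toy_counts)
```

How ties between the reference and an isomiR are handled is a choice made here for illustration; the actual analysis is described in the minutes.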
@mhalushka is it OK if I share the raw data with you and you analyze it with miRge?
Next meeting is on October 4th, at 10 am Boston time. (https://zoom.us/j/553765969)
See you there!
@lpantano. Yes. I'm happy to run the data through miRge.
Hi all,
these are the minutes: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#10-04-2018
@mhalushka, we came up with some questions for the authors. Do you think you can try to contact them? Happy to clarify over email.
Next meeting is on October 18th, at 10 am Boston time. (https://zoom.us/j/553765969)
cheers
Hi all,
Today we mainly discussed the paper. These are the points that we want to standardize; if nobody has a strong opinion, we'll go ahead with them:
The minutes are here: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#10-18-2018
Next meeting is on November 1st, at 10 am Boston time. (https://zoom.us/j/553765969)
cheers
Hi all, sorry for the late reply.
This totally makes sense to me. It just depends on where the focus is put and seems just like a convention to decide upon. These are just two ways to look at an isomiR, and it could be the opposite if we focus more on the isomiR sequence itself and its length (an isomiR with 1 extra nucleotide would be "+1" because it's one longer) vs the start of the alignment (an isomiR with 1 extra nucleotide would be "-1" because it starts one nucleotide earlier).
And I think whichever system is chosen, the important thing is to be very clear in the definition. As far as I know there is not really anything similar for protein-coding genes, and mRNA transcript isoforms are referred to as "delta exon X" or things like that, but there's no real consensus I'm aware of. For myself, I'm totally fine with this convention or the other, as long as we define it without ambiguity.
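To make the two conventions concrete, here is a minimal Python sketch for the 5' end. The function name and the coordinates are hypothetical and are not part of mirtop or the mirGFF3 definition; it only shows that the two views differ by a sign.

```python
def five_prime_offsets(ref_start, iso_start):
    """Return the 5' offset of an isomiR under both conventions.

    Coordinates are positions on the precursor (hypothetical example values).
    """
    shift = iso_start - ref_start  # negative when the isomiR starts upstream
    return {
        # Alignment view: starting one nucleotide earlier gives -1.
        "alignment_based": shift,
        # Length view: one extra nucleotide at the 5' end gives +1.
        "length_based": -shift,
    }


# The reference starts at position 10; the isomiR starts one nucleotide earlier.
offsets = five_prime_offsets(ref_start=10, iso_start=9)
print(offsets)  # {'alignment_based': -1, 'length_based': 1}
```

Either view carries the same information, which is why the written definition needs to state which one is used.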
Cheers, Thomas
Just for clarity (and I apologize if this was already covered), but for the ref miRNA start position, is this the alignment in miRBase or miRGeneDB? I think there may be a very rare miRNA that differs between the two sites and I don't recall if that is settled or not. It may be that all of the differences between miRBase and miRGeneDB are on the 3' end. I think @BastianFromm may be able to clarify that second point.
Hi!
Interesting discussion.
I really think Marc's point is very important here regardless of what is included as microRNA. But do we even agree that a microRNA has to have a reference annotation somewhere?
If so, it would have to be either miRBase or MirGeneDB, and from the comparison we did for the Annual Reviews and for the new paper, I know that unfortunately the differences are not only at 3p ends but also at 5p ends, along with simply incorrect coordinates (despite correct sequence annotations) and completely missing arm annotations.
From the Annual Reviews:
"The third and final key distinction between miRBase and MirGeneDB is that not only are all mature and star sequences carefully curated, they are also remapped to the latest genome assemblies. When the Gene Expression Omnibus data (10 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/#R10) provided by miRBase for these entries (Figure 3a https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/figure/F3/?report=objectonly) are compared with current genome coordinates (miRBase 21; GRCh38), we find that 69.5% of the annotated reads [should be arms] from the 523 accepted human miRNA precursors (Supplemental Table 1 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/#SD1) have missing or incorrect annotations (Supplemental Table 6 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/#SD1). Specifically, 105 expressed sequences (10%) lack annotation all together; 214 (20.5%) have correct sequences but incorrect genome coordinates (e.g., Let-7-P1, Supplemental Table 6 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/#SD1); and 406 (39%) have genome coordinates whose length and/or start position differs between 1 and 8 nucleotides from the updated annotation (Supplemental Table 6 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4743252/#SD1). Indeed, of these 406 sequences, 115 are offset at the 5′ end with respect to the miRBase entries."
And this is much worse of course in other organisms.
Best, Bastian
Hi all,
Thanks for the comments, very helpful. I will then update the definition to be consistent with @phillipeloher's proposal.
I agree that each database would have its own reference, and MirGeneDB is trying to improve this; I am very grateful for all that work. For this same reason, I think the idea is to have the file format be database dependent, which can make it easy, for instance, to translate miRBase to MirGeneDB. And people working with a new species can still use the format.
Hopefully this tool can further expose the issues of some databases and help find a consensus in the future.
Thanks!
Great feedback, we'll work on updating the definitions today and tomorrow.
Hi,
Here are the minutes from yesterday's meeting. Pay attention to the paper @phillipeloher found: they have a section on isomiRs, and although we are going further, we overlap a little:
People: Lorena, Phillipe, Ioannis, Arun, Marc
We discuss:
Update the paper according to the discussed points.
Next meeting 11-15-2018, 10am EST time, 4pm GMT+1 time.
Hi all,
I hope you had a good holiday (for those of you in the USA).
Sorry for missing the minutes from the last meeting.
We discussed two topics:
Experimental design to evaluate whether false isomiRs are sequencing artifacts or not (please ask to be added to https://docs.google.com/document/d/1XyKjQJ2R6qdES12uDK-5ffA43xgEHmcZaZ_6Vr6yWFM/edit?usp=sharing). I put the hypothesis there, and ideally we will get some proposals for experiments that we can do to prove this (I added mine).
I am going to put together everything we have until now in a presentation and paper format and share it with you.
Since there is a lot on my plate, I am canceling this week's meeting (no meeting on the 29th); we will have it on the 13th to decide on the experimental design explained in the previous point.
I will give a local talk on the 6th, that I will stream so you are welcome to join, I will post the link here next week.
Next meeting 12-13-2018, 10am EST time, 4pm GMT+1 time.
Thanks!
Hi all,
I am giving a talk tomorrow that will summarize the mirtop and mirGFF3 project, as well as the re-analysis of the Tewari and other data sets. You can join online (tomorrow at 11 am Boston time):
Meeting URL https://bluejeans.com/211525664
Meeting ID 211 525 664
Moderator Passcode 2730
Want to dial in from a phone? Dial one of the following numbers:
+1.408.740.7256 (US (San Jose))
+1.888.240.2560 (US Toll Free)
+1.408.317.9253 (US (Primary, San Jose))
Enter the meeting ID and passcode followed by #
Connecting from a room system? Dial: 199.48.152.152 or bjn.vc and enter your meeting ID & passcode
Hi all,
I recorded the talk summarizing the work we have done with the re-analysis, you can access it through here: https://bluejeans.com/s/h4CR2
Next meeting 12-13-2018, 10am EST time, 4pm GMT+1 time. (https://zoom.us/j/553765969)
Here is the document to add the experimental design that can help to study the isomiR accuracy in sequencing: https://docs.google.com/document/d/1XyKjQJ2R6qdES12uDK-5ffA43xgEHmcZaZ_6Vr6yWFM/edit?usp=sharing
Talk to you soon.
cc @carriewright11
Hi everybody!
Happy holidays.
Next meeting 01-17-2019, 10am EST time, 4pm GMT+1 time. (https://zoom.us/j/553765969)
Enjoy the break.
Cheers
I am archiving this thread to open the Road plan for 2019 soon.