seb-mueller / chlamy_locus_map

Small RNA Locus Map for Chlamydomonas reinhardtii
GNU General Public License v3.0

Drafting article #17

Open nmatthews323 opened 5 years ago

nmatthews323 commented 5 years ago

Hi Seb,

Hope you had a nice weekend. I've redrafted some sections of the manuscript.

The main things I've done are:

Can you take a look and see what you think? The general sRNAs in Chlamy vs other organisms is the key bit of discussion we need. Then we can send to David and Adrian. Also, I have three summary tables generated in csv format, you mentioned you knew how to make nice tex tables from these?
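One possible way to turn the CSV summaries into LaTeX tables is sketched below. It is only a minimal illustration in Python, not the approach actually used for the manuscript: the function names, the example column names and the example values are all hypothetical, and the output assumes the `booktabs` package is loaded in the LaTeX preamble.

```python
import csv
import io

# Characters with special meaning in LaTeX that must be escaped in cell text.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters in a single table cell."""
    return "".join(_LATEX_ESCAPES.get(ch, ch) for ch in text)


def csv_to_latex(csv_text, caption, label):
    """Convert CSV text (first row is the header) into a booktabs table."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    lines = [
        r"\begin{table}[ht]",
        r"\centering",
        r"\caption{" + escape_latex(caption) + "}",
        r"\label{" + label + "}",
        r"\begin{tabular}{" + "l" * len(header) + "}",
        r"\toprule",
        " & ".join(escape_latex(c) for c in header) + r" \\",
        r"\midrule",
    ]
    lines += [" & ".join(escape_latex(c) for c in row) + r" \\" for row in body]
    lines += [r"\bottomrule", r"\end{tabular}", r"\end{table}"]
    return "\n".join(lines)


# Hypothetical example data; the real summary tables are not shown in this thread.
example_csv = "locus_class,n_loci\nclass_A,12\nclass_B,7\n"
latex_table = csv_to_latex(example_csv, "Example summary", "tab:example")
print(latex_table)
```

The escaping step matters for these tables in particular, since column names with underscores would otherwise break the LaTeX build.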

Around for a call this afternoon if you like.

nmatthews323 commented 5 years ago

Hey Seb,

How are you? Hope all is well with the (not so) newborn!

Have you managed to take a look at this at all? Or know when you'll get around to it? I'm currently planning out my time-commitments for April/May/June so would be good to slot this in where appropriate. We talked about trying to have a draft done by the end of April?

I fly back to the UK from California tonight. Can be around for a call Wednesday late afternoon, Thursday morning, or Friday until about 2pm.

seb-mueller commented 5 years ago

Hi Nick,

Sorry for the delayed reply. This time, teaching, training and supervisions have taken their toll. I was planning to work on it this week, so I should have something done after the Easter break. I'll keep you up to date. Hope you had a good return from the US?

nmatthews323 commented 5 years ago

Hi Seb, no problem at all! Plenty going on my end to get stuck into... Let me know how you get on looking at it and I'll then try and allocate some of my time to revisit it.

Ended up flying straight off to Nice for a course after my return to the UK, so now just about getting settled again in Manchester (though currently on the train to Brighton for a workshop...). How are things with you? Getting much sleep!?

nmatthews323 commented 5 years ago

Hi Seb, hope all is well! I just wanted to check-in with regards to this paper. I've been pretty packed out with PhD work for the past couple months but I might have a bit of time at the start of July to finish this off. Are you going to have any time to put into this? It's pretty close I think but needs a last slog of writing and pulling together.

I know things must be super hectic with you at the moment and I'd hate for this work to interfere with you spending time with your kid. Let me know your thoughts?

seb-mueller commented 4 years ago

Hi Nick! First of all, very very sorry for only getting back to you now! Indeed, I was struggling to make ends meet until September, since I was on parental leave and my daughter had a condition that took up most of my time. It is mostly fine now and I'm back at work too. Nevertheless, I feel bad for not having updated you. So apologies, hope I can make it up to you!

Anyway, to try to make up for it, I've got back onto the paper and am trying to make some progress! I'll post updates here as I produce plots and work on the paper.

Anyway, hope you are doing well and enjoyed the summer!

seb-mueller commented 4 years ago

Having had a go at your comment on the paper about starting the results section, i.e. characterizing sRNAs in Chlamy as well as contrasting them with higher plants, I've put some code together and come up with this figure:

[Figure: sRNA_size_rep_firstNuc_chlamy_vs_arabidopsis]

The code is in here: https://github.com/seb-mueller/chlamy_locus_map/blob/master/Scripts/Segmentation_Analysis_post_selection.R

I've only included WT libraries for both species. What do you think of this? Can you think of any additions to it?

seb-mueller commented 4 years ago

I've even got a new version that includes the average abundance of sRNA species. This reveals why there is such a peak of redundant sRNAs at 17 nt: it's mostly due to highly expressed sRNAs:

[Figure: sRNA_size_rep_firstNuc_chlamy_vs_arabidopsis, updated version]
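For readers unfamiliar with the quantities plotted, the sketch below shows how redundant counts (all reads), non-redundant counts (distinct sequences), average abundance and 5' nucleotide composition relate to each other per size class. It is a Python illustration only, not the R code in the linked script; the function name and the toy read counts are made up.

```python
from collections import defaultdict


def size_profile(counts):
    """Summarise sRNA reads by length.

    `counts` maps each distinct sRNA sequence to its read count.
    """
    profile = defaultdict(
        lambda: {"redundant": 0, "non_redundant": 0, "first_nuc": defaultdict(int)}
    )
    for seq, n in counts.items():
        entry = profile[len(seq)]
        entry["redundant"] += n          # every read counts
        entry["non_redundant"] += 1      # each distinct sequence counts once
        entry["first_nuc"][seq[0]] += n  # 5' nucleotide, weighted by reads

    result = {}
    for length, entry in sorted(profile.items()):
        result[length] = {
            "redundant": entry["redundant"],
            "non_redundant": entry["non_redundant"],
            # Average reads per distinct sequence in this size class.
            "mean_abundance": entry["redundant"] / entry["non_redundant"],
            "first_nuc": dict(entry["first_nuc"]),
        }
    return result


# Toy counts, invented for illustration: one very abundant 17-nt species.
toy_counts = {
    "TTAGTGACGCGCATGAA": 1000,
    "ACGTACGTACGTACGTA": 2,
    "TGGAACGTACGTACGTACGTA": 30,
    "AGGAACGTACGTACGTACGTT": 10,
}
profile = size_profile(toy_counts)
for length, stats in profile.items():
    print(length, stats)
```

In the toy data the 17-nt class has only two distinct sequences but a very high mean abundance, which is the kind of pattern that produces a redundant-read peak without a matching non-redundant peak.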

nmatthews323 commented 4 years ago

Hi Seb, no worries at all, hope all is going well?

These figures look really great, thanks! Interesting on the 17nt, that didn't show up in the original project, but some biases for smaller length nucleotides did show up in the clustering last time and this time (see clusters 4 and 6 - https://www.overleaf.com/17574001mfpjmkhghhtc). Do we know if there's a specific library or subset of libraries driving it or whether it's the same pattern when all libraries are included? I know this script takes a while to run due to loading the raw reads in...

More generally, I need to catch up on where we're at and what still needs doing. Do you have any time on Friday for a quick chat?

Currently, I'm doing an internship in Westminster with the Government Office for Science. As it happens, I'm doing a project looking at opportunities with next-generation high-performance computing, so my limited cluster experience is coming in handy! I'll be back to the PhD from 22nd October and should hopefully still be able to allocate a bit of time to this, as it would be good to get it out.

seb-mueller commented 4 years ago

All fine so far! Mostly settled in by now and things are mostly back to normal :)

The 17 nt peak is curious indeed! I've only included WT libraries (which we annotated as "wt" in the Control column of Summary_of_Data.csv). However, you are right, it's only a subset. Having looked at a few, the data is split into data sets with a massive peak at 17 nt (e.g. SL2301-03) and some without (e.g. SL2310-27).

Since Adrian has prepared some of the libraries, I'll ask him about it, but I'm tempted to take the weird ones out, since there are only a few of them but they skew the overall picture (and to mention it in the methods).

Friday works well for me. I'll be either at the office (in the morning) or at the climate demo here in Cambs (from 4pm probably). Just call me up whenever! I'll probably try to go on with the paper in the meantime, trying to fill in the gaps and getting the data into the repositories (which always takes quite some time).

Also, really impressive making it to Westminster; an internship might be a foot in the door. Curious to hear more on Friday!

nmatthews323 commented 4 years ago

Great, I'll call you late morning then I think.

seb-mueller commented 4 years ago

Just had another look in the literature; Zhao 2007 didn't have the 17 nt peak. So I'll remove the weird libraries for the time being:

https://www.ncbi.nlm.nih.gov/core/lw/2.0/html/tileshop_pmc/tileshop_pmc_inline.html?title=Click%20on%20image%20to%20zoom&p=PMC3&id=1865491_1190fig1.jpg

nmatthews323 commented 4 years ago

Does this mean re-running the segmentation and clustering?

seb-mueller commented 4 years ago

I think the clustering and segmentation should be fine. This will most likely be one 17 nt sRNA species which is massively overrepresented and would only affect a few loci (maybe even only one). Those loci would also still be valid. They are just sometimes overexpressed and sometimes not.

I'll just take them out for the sRNA abundance characterization, which is fine as long as we state it. So no worries :)

nmatthews323 commented 4 years ago

Ok, great, that's a relief!

seb-mueller commented 4 years ago

I think I found the culprit! It seems to be a single sRNA, TTAGTGACGCGCATGAA, apparently mapping to a 26S ribosomal RNA. It occurs in 6 places and varies quite a lot in abundance (in line with what I found above):

Slot "alignments":
GRanges object with 6 ranges and 2 metadata columns:
           seqnames          ranges strand |               tag multireads
              <Rle>       <IRanges>  <Rle> |       <character>  <numeric>
  [1]  chromosome_1     17664-17680      - | TTAGTGACGCGCATGAA          6
  [2]  chromosome_1 2917272-2917288      + | TTAGTGACGCGCATGAA          6
  [3]  chromosome_8 5032465-5032481      + | TTAGTGACGCGCATGAA          6
  [4]  chromosome_9 2352406-2352422      - | TTAGTGACGCGCATGAA          6
  [5] chromosome_14 4149992-4150008      + | TTAGTGACGCGCATGAA          6
  [6] chromosome_14 4157494-4157510      + | TTAGTGACGCGCATGAA          6
  -------
  seqinfo: 54 sequences from an unspecified genome; no seqlengths

Slot "data":
Matrix with  6  rows.
     SL2108 SL2121 SL2122 SL2123 SL2124 SL2125 SL2181 SL2182 SL2183 SL2184 SL2185 SL2186 SL2187 SL2188 SL2189  SL2301  SL2302
[1,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
[2,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
[3,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
[4,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
[5,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
[6,]    460    330    206    132     90     73 479485 478175 542858 635830 601947 836510 290791 429457 392290 1448183 1042020
      SL2303 SL2310 SL2311 SL2312 SL2313 SL2314 SL2315 SL2322 SL2323 SL2324 SL2325 SL2326 SL2327
[1,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54
[2,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54
[3,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54
[4,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54
[5,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54
[6,] 1397718    103    111    140    196     99     60     74    102    256    125     17     54

Slot "libsizes":
 [1] 103552  94019  69880  38635  33629  33523  32321  24230  28473  25244  26158  38674  23871  23962  23475  76650  50659  65313
[19] 196369 202368 189698 169294 168801 149574 169330 167851 236223 168296 126186  93824

I think I'll just exclude this species and mention this in the text (plus ask Adrian about it). Sorry for being so verbose, these are also mostly notes to myself ;)
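The split between libraries can be made explicit with a short sketch. The per-library counts below are copied from one row of the "data" slot above (all six rows are identical because the same tag is reported at six genomic positions, so summing the rows would count each read six times). Everything else is an assumption for illustration: the code is Python rather than the project's R, the cut-off of 10,000 raw reads is arbitrary, and `drop_sequences` is a hypothetical helper. The "libsizes" values are not used, since some counts exceed them and their units are unclear from the output.

```python
# Read counts for TTAGTGACGCGCATGAA per library, copied from the output above.
libraries = [
    "SL2108", "SL2121", "SL2122", "SL2123", "SL2124", "SL2125",
    "SL2181", "SL2182", "SL2183", "SL2184", "SL2185", "SL2186",
    "SL2187", "SL2188", "SL2189", "SL2301", "SL2302", "SL2303",
    "SL2310", "SL2311", "SL2312", "SL2313", "SL2314", "SL2315",
    "SL2322", "SL2323", "SL2324", "SL2325", "SL2326", "SL2327",
]
culprit_counts = [
    460, 330, 206, 132, 90, 73,
    479485, 478175, 542858, 635830, 601947, 836510,
    290791, 429457, 392290, 1448183, 1042020, 1397718,
    103, 111, 140, 196, 99, 60,
    74, 102, 256, 125, 17, 54,
]

THRESHOLD = 10_000  # hypothetical cut-off on raw read counts

counts_by_library = dict(zip(libraries, culprit_counts))
flagged = [lib for lib, n in counts_by_library.items() if n > THRESHOLD]
unflagged = [lib for lib in libraries if lib not in flagged]


def drop_sequences(counts_by_seq, excluded):
    """Return a copy of a sequence -> count table without the excluded sequences."""
    return {seq: n for seq, n in counts_by_seq.items() if seq not in excluded}


print("libraries with the 17-nt species highly abundant:", flagged)
print("libraries without:", unflagged)
```

With these numbers the libraries fall cleanly into two groups with no intermediate cases, so the exact threshold makes little difference here.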

seb-mueller commented 4 years ago

Hi Nick!

I've just finished the bulk of what we discussed on the phone, just in time for mid-October (if I remember correctly, you said you might have some time then to have a look).

Even if you don't have much time, could you have a quick look at the general structure and plots? I'd like to pass it on to Adrian as soon as possible, since I've been doing the data submission, which turns out to be complicated, and I need his help. I also wanted to share the newest version at the same time, since some of it is needed to fill in the database.

If you have a bit more time, the main thing left is the discussion and conclusion. Maybe you could use your old version and tweak it? It doesn't have to be perfect since we are submitting to bioRxiv first?

Thanks!

Quick list of things I changed:

  • added browser shots
  • overhauled the results section (especially the sections before MCA)
  • updated/rearranged/added figures, and merged them together
  • added tables, named them, and referenced them in the text
  • submitted raw data to the database
  • added to the methods section (not quite complete yet, but almost there)

Any input on the above is welcome. Let me know if you agree with the major changes (you can see the changes in the Overleaf history tab, using the "compare to another version" button or so).

nmatthews323 commented 4 years ago

Hi Seb,

Great! Thanks so much for your hard work on this.

I am quite swamped this week as it's the last week of my internship. I was planning on working on the discussion next Wednesday (23rd October); not sure if that's too late for your plan?

Let me know! I might have some time to do stuff before then, and I will definitely have a first look through what you've done today or tomorrow.

seb-mueller commented 4 years ago

No worries, it's not that urgent, the 23rd is fine! Even if you only manage to skim it by then, let me know what you think. I just didn't want to send weird stuff to Adrian :) Thanks!

nmatthews323 commented 4 years ago

Hi Seb!

Had a look through, looking really good!! Thanks so much for all the work you've put in. Spotted a small number of things, some small, some things we probably need to discuss. Planning to get onto writing discussion this week!

  1. Line 126: Why is this not a single % - the total % of redundant reads that come from miRNAs? It's not quite clear what 0.5% and 9% refer to at the moment.
  2. Line 133: I've reworded this, I think the notable point at 21nt is the more pronounced A and T enrichment - see what you think
  3. Methods for sRNA size classes - previous graphs of sizes showed 20 and 21nt being most abundant, but now 20nt isn't so present, however we still have it in the methods as saying so, this therefore reads a bit confusingly but not sure what to do about it. We also need the size ratio density plots in there somewhere.
  4. I've commented out the methylation methods as we dumped this
  5. General issue, we're not describing results in the same tense - "there was a predominance for uracil" or "there is a predominance for uracil". I think I do the former and you do the latter. Don't really mind which way round we do but we need to do the same!

All the best!

nmatthews323 commented 4 years ago

Hi Seb,

I've added in a short discussion section. I'd like to have another go through the whole paper and tidy up the wording etc. but if you've already sent it to Adrian I might wait until we hear back from him to do further edits? What do you think?

I also re-added the methylation methods, as I remembered that we didn't dump them.

Have plenty of time for a call next couple of days if you'd like.

All the best,

Nick

seb-mueller commented 4 years ago

Brilliant! Thanks so much. I'll have a look tomorrow or Friday and then send it off to Adrian (I haven't so far)! Maybe we can have a chat next Thursday? I'll probably be quite busy preparing for a webinar I'm holding on Wednesday (https://www.dolomite-bio.com/webinar-dropseqpipe-nadia-endless-possibilities-droplet-data/).

nmatthews323 commented 4 years ago

Sounds great, thanks, and webinar looks interesting! Chat next Thursday sounds good, hope the webinar goes well.

Nick
