adamewing / tldr

Identify and annotate TE-mediated insertions in long-read sequence data
MIT License

callmeth.sh header #21

Closed Coracollar closed 2 years ago

Coracollar commented 2 years ago

Hi, I'm trying to calculate methylation frequency from the callmeth.sh output. However, there seems to be a problem with the header, even though the header looks okay.

my file looks like this:

chromosome strand start end read_name log_lik_ratio log_lik_methylated log_lik_unmethylated num_calling_strands num_motifs sequence
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 03de98f9-9155-475e-b80c-5952fa449927 3.32 -79.74 -83.06 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 + 45 45 0563bcf3-41de-4f3a-bed3-bacf6990de67 4.65 -133.14 -137.79 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 + 45 45 083364bd-748d-4ba4-8c7a-07918d67a595 5.85 -100.39 -106.24 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 53a2038a-376f-4571-8ef3-12fcf593f773 -1.75 -126.39 -124.63 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 6948bfe3-058d-4eaf-b3a2-2284eaabf892 6.49 -104.35 -110.84 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 7203447c-fb52-4b39-b262-943bd2845e83 7.44 -139.72 -147.16 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 + 45 45 81c21fe2-04e6-4802-9bc4-e93d19432f00 7.21 -108.18 -115.40 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 94c79f9d-c701-41a2-be8a-04309b46a44b -1.02 -154.81 -153.79 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 96124793-5040-47a4-a5e6-3eeebb880496 7.31 -115.83 -123.14 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 + 45 45 af21be23-6ea9-4b93-8db5-532e9ed885fc 5.07 -168.89 -173.95 1 1 AAATACGCCAG
9433d43d-aae9-4757-a9a8-7e980999e835 - 45 45 bd9846ed-9973-488a-b346-2e5ac1cc52c3 2.98 -86.21 -89.19 1 1 AAATACGCCAG

and I get this error:

f5c meth-freq -i C0_tldr.te.meth.tsv
Incorrect header: chromosome strand start end read_name log_lik_ratio log_lik_methylated log_lik_unmethylated num_calling_strands num_motifs sequence

When using calculate_methylation_frequency.py, I get a KeyError for num_motifs, even though the num_motifs column does exist:

calculate_methylation_frequency.py C0_tldr.te.meth.tsv
Traceback (most recent call last):
  File "/usr/local/easybuild-2019/easybuild/software/mpi/gcc/8.3.0/openmpi/3.1.4/nanopolish/0.13.2-python-3.7.4/scripts/calculate_methylation_frequency.py", line 41, in <module>
    num_sites = int(record['num_motifs'])
KeyError: 'num_motifs'

awk '{print $10}' C0_tldr.te.meth.tsv
num_motifs
1
1
1
1

Any idea what might be happening?
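One possible explanation, consistent with the fix reported later in this thread, is that the header line is separated by spaces while the downstream tools expect tabs. awk splits on any whitespace by default, so it would still find column 10. The sketch below reproduces that mismatch in Python. It assumes the frequency script reads the file with `csv.DictReader` and a tab delimiter; the sample row (`contig1`, `read1`) is made up for illustration.

```python
import csv
import io

# Hypothetical sample: header separated by spaces, data row separated by tabs.
header = (
    "chromosome strand start end read_name log_lik_ratio "
    "log_lik_methylated log_lik_unmethylated num_calling_strands "
    "num_motifs sequence"
)
row = "\t".join(
    ["contig1", "-", "45", "45", "read1", "3.32", "-79.74", "-83.06", "1", "1", "AAATACGCCAG"]
)
sample = header + "\n" + row + "\n"

# Splitting on any whitespace (awk's default behaviour) finds the column.
tenth_column = header.split()[9]
print(tenth_column)

# A tab-delimited reader sees the whole header as a single field name,
# so looking up 'num_motifs' in a record fails.
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
record = next(reader)
print(len(reader.fieldnames))
print("num_motifs" in record)
```

If this is what is happening, the header only looks correct on screen: the column names are all present, but they are not separated by the character the parser splits on.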

adamewing commented 2 years ago

Hi, sorry for the delayed response. I'm following up to the point about "f5c": is that another program you're running? If so, is it meant to operate on nanopolish output?

Coracollar commented 2 years ago

No, the point was about the header. I was advised to change it, so I replaced the original header with one written by:

echo -n -e "chromosome\tstrand\tstart\tend\tread_name\tlog_lik_ratio\tlog_lik_methylated\tlog_lik_unmethylated\tnum_calling_strands\tnum_motifs\tsequence\n" > headerfile

And it worked.

Thanks, Cora
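The same header repair can be sketched in Python. This is a minimal illustration, not part of tldr or nanopolish: the function name `fix_header` and the sample row are hypothetical, and it assumes the only problem is that the first line is whitespace-separated instead of tab-separated.

```python
import csv
import io

# Column names as they appear in the header quoted in this thread.
EXPECTED = [
    "chromosome", "strand", "start", "end", "read_name", "log_lik_ratio",
    "log_lik_methylated", "log_lik_unmethylated", "num_calling_strands",
    "num_motifs", "sequence",
]


def fix_header(text):
    """Return text with its first line rewritten as a tab-separated header."""
    first, newline, rest = text.partition("\n")
    fields = first.split()  # split on any run of whitespace
    if fields != EXPECTED:
        raise ValueError("unexpected header: %r" % first)
    return "\t".join(fields) + newline + rest


# Hypothetical input: space-separated header, tab-separated data row.
broken = " ".join(EXPECTED) + "\n" + "\t".join(
    ["contig1", "-", "45", "45", "read1", "3.32", "-79.74", "-83.06", "1", "1", "AAATACGCCAG"]
) + "\n"

fixed = fix_header(broken)
fixed_record = next(csv.DictReader(io.StringIO(fixed), delimiter="\t"))
print(fixed_record["num_motifs"])
```

To apply it to a real file, read the file's contents into a string, pass it through `fix_header`, and write the result to a new file instead of overwriting the original.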

adamewing commented 2 years ago

I think this has been resolved.