bergmanlab / mcclintock

Meta-pipeline to identify transposable element insertions using next generation sequencing data

result files empty or not generated - ngs_te_mapper2 and popoolationTE2 #84

Closed OctavioSerra closed 2 years ago

OctavioSerra commented 3 years ago

Hi,

First, I would like to thank you for developing this tool and making life easier in many ways for TE analysis.

I am running mcclintock with methods ngs_te_mapper2, popoolationTE2, and retroseq.

When the run finished for the first sample, I went to the summary directory, opened the HTML summary, and got this:

image

I found it a little weird that ngs_te_mapper2 gave 0 results. I looked into the ngs_te_mapper2 results directory and indeed the BED files were created but are empty. I then checked its log file and did not find anything suspicious, only this message on the 7th line.

image
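A minimal sketch of how the empty BED files described above could be listed in one pass. The function name `list_empty_beds` is hypothetical, and "empty" is assumed to mean zero bytes; a file that contains only a header line would not be reported.

```shell
# Hypothetical helper: print every zero-byte .bed file under a directory.
# Usage (the path is an example): list_empty_beds results/
list_empty_beds() {
    # -size 0 matches files that occupy zero blocks, i.e. zero-byte files
    find "$1" -type f -name '*.bed' -size 0
}
```

An empty listing would mean every BED file has at least some content, which still says nothing about whether that content is valid.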

Then I went to look into the popoolationTE2 results in detail, but the results directory was empty. How is it possible to have no result files when the HTML summary shows results for popoolationTE2?

When inspecting the popoolationTE2 log file, I saw these error messages:

image

In detail, I got 2,574,324 "Mapped mate should have mate reference name" and 600,135 "Read CIGAR M operator maps off end of reference" messages. I don't know whether these errors are related in any way to the absence of result files for popoolationTE2.
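Counts like the ones above can be reproduced with `grep`. This is a minimal sketch: the function name `count_messages` is hypothetical, the log path has to be supplied by the caller, and the two message strings are copied from the post.

```shell
# Hypothetical helper: count the log lines that contain each message.
# Usage: count_messages <path to the popoolationTE2 log>
count_messages() {
    log="$1"
    for msg in "Mapped mate should have mate reference name" \
               "Read CIGAR M operator maps off end of reference"; do
        # -c counts matching lines; -F treats the message as a fixed string
        n=$(grep -c -F -- "$msg" "$log")
        printf '%s\t%s\n' "$n" "$msg"
    done
}
```

Each output line is a count followed by a tab and the message. `grep -c` counts lines, so a line that contained the same message twice would be counted once.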

Retroseq ran smoothly and generated the expected BED files.

All installation steps and test runs were ok.

I clearly need some help here. Do you have any idea of what is happening?

Please let me know if you need any further information to investigate the issue.

Best, Octávio Serra

pbasting commented 3 years ago

Thanks,

Preston

OctavioSerra commented 3 years ago

Hi Preston, thank you so much for the quick reply.

Here is my ngs_te_mapper2 log file:

ngs_te_mapper2.log

Regarding popoolationTE2, I did notice that temp files are created inside results/popoolationTE2 during the run, but the directory becomes empty when the run ends. So I guess they are created and used for the summary and then somehow deleted?

My git commit hash is: 5849097de4f74b0b8b149cad138e31024082924c

Here is my full command for the 8 samples that I am interested in:

for i in {1..8}; do python3 /opt/mcclintock/mcclintock.py -r PyrusCommunis_BartlettDHv2.0.fasta -c TE_consensus_Pyrus_communis.fa -1 "fastq_files/clone"$i"_R1.fq.gz" -2 "fastq_files/clone"$i"_R2.fq.gz" -m popoolationte2,retroseq,ngs_te_mapper2 -p 60 -o results/ --resume &>mcclintock.log; done

Hope this helps. Feel free to ask anything else you may need.

Octávio Serra

pbasting commented 3 years ago

Hi @OctavioSerra ,

Unfortunately, I am unable to determine what is going on based on the log files. Nothing stands out as being the cause of these problems.

I just merged a relatively sizeable update to mcclintock (https://github.com/bergmanlab/mcclintock/commit/77a65fb8fa05dfdb082695f215f48e2b473dd735) so I'd suggest trying to run the newest version on your samples. If the problems persist, then I would probably have to try to run one of your samples myself (if you are willing to share it or if it is public already) to see exactly what is actually going on. If you are OK with sharing the input files for one of the samples giving you problems, you can contact me at: preston.basting@uga.edu to discuss how to transfer the files.

Thanks,

Preston

OctavioSerra commented 3 years ago

Hi @pbasting, sorry for the late reply. I will try this now with the newest version and I will let you know if the issue continues.

Thank you very much, Octávio Serra

cbergman commented 2 years ago

@OctavioSerra: did you resolve this issue? If no, could you please do a clean install and try again? If yes, I'd like to close this issue. Thanks!

OctavioSerra commented 2 years ago

Hi @cbergman. Sorry for not giving feedback.

The issue still happened with the newest version of mcclintock available at that date. Eventually I overcame it by running one method at a time: first a run with ngs_te_mapper2, then a second run with popoolationTE2, and so on.

This way I did not get the summary comparison between methods, but at least this allowed me to get the result files I was expecting for each of the tools.
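The one-method-at-a-time workaround could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands that were run: the flags and input paths are copied from the command posted earlier in the thread, while the per-sample, per-method output directories, the `logs/` directory, and the `DRY_RUN` switch are hypothetical additions.

```shell
# Sketch: one McClintock invocation per sample and per method.
MCCLINTOCK=/opt/mcclintock/mcclintock.py
METHODS="ngs_te_mapper2 popoolationte2 retroseq"

# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

run_one() {
    sample="$1"
    method="$2"
    # Assumed layout: a separate output directory for each sample and method
    out="results/clone${sample}/${method}"
    set -- python3 "$MCCLINTOCK" \
        -r PyrusCommunis_BartlettDHv2.0.fasta \
        -c TE_consensus_Pyrus_communis.fa \
        -1 "fastq_files/clone${sample}_R1.fq.gz" \
        -2 "fastq_files/clone${sample}_R2.fq.gz" \
        -m "$method" -p 60 -o "$out"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        mkdir -p logs
        # One log file per run, so later runs do not overwrite earlier logs
        "$@" > "logs/clone${sample}_${method}.log" 2>&1
    fi
}

run_all() {
    for i in 1 2 3 4 5 6 7 8; do
        for m in $METHODS; do
            run_one "$i" "$m"
        done
    done
}

run_all
```

With the default `DRY_RUN=1` the script only prints the commands, which makes it easy to check paths before starting any run. As noted above, separate runs do not produce the cross-method summary.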

Since then I have not run mcclintock again, so I am unable to say whether the issue remains in newer versions.

But it is "solved" for me, so in my opinion you can close it.

Octávio Serra