rlegendre closed this issue 1 year ago
I meant to remove the wide output option and switch exclusively to the long format, so I haven't been checking it while updating and fixing other things.
I will look into it. But can you also check to see if the bug exists in the long (default) form of the output?
Without a valid BAM file (not just the read, but a file with a real header) and the CTCF.bed file, I am unable to reproduce or debug your issue. Feel free to reopen if you can upload these files.
Thanks for your answer. I tested without the wide option (which, in my own opinion, is a very useful option), and the results are still incorrect.
I have prepared an archive with part of my BAM file and some CTCF sites, with which I am able to reproduce the error with both options. The data are available here: https://dl.pasteur.fr/fop/25g5TQgh/Test_data.tar
Thank you for your help.
Hi @rlegendre,
I can help, but I need a smaller dataset: this one is going to take more than 2 hours to download, and it will be hard for me to identify the reads with issues.
Ideally, the subsetted BAM and BED files would have only 1-2 CTCF sites and fewer than 5 reads per site.
Sorry for being picky about test cases, but to make solving these issues productive I need to build unit tests that require small files I can add to the repo and test with every change. I hope you understand.
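For anyone preparing a test case like the one described above, here is a minimal sketch of the subsetting step in plain Python. It is not part of fibertools. It assumes the BAM has already been converted to SAM text (for example with `samtools view -h`), and the names `reference_span` and `subset_sam` are hypothetical helpers made up for this illustration. It keeps all header lines, takes the first few sites from the BED file, and keeps a limited number of overlapping reads per site.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(pos_1based, cigar):
    """Return the 0-based, half-open reference interval covered by a read."""
    start = pos_1based - 1  # SAM POS is 1-based, BED is 0-based
    length = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return start, start + length


def subset_sam(sam_lines, bed_lines, max_sites=2, max_reads=5):
    """Keep header lines plus at most max_reads reads for each of the
    first max_sites BED sites. Returns (kept_sam_lines, sites)."""
    sites = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        sites.append((chrom, int(start), int(end)))
        if len(sites) == max_sites:
            break

    kept = []
    counts = [0] * len(sites)
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)  # header lines are always kept
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":
            continue  # skip reads without an alignment
        r_start, r_end = reference_span(pos, cigar)
        for i, (s_chrom, s_start, s_end) in enumerate(sites):
            overlaps = chrom == s_chrom and r_start < s_end and r_end > s_start
            if overlaps and counts[i] < max_reads:
                counts[i] += 1
                kept.append(line)
                break  # count each read against one site only
    return kept, sites
```

The kept lines can be written to a file and converted back to BAM with `samtools view -b`, and the returned sites written out as the matching small BED file. A read that overlaps several sites is counted against the first one that still has room, which is a simplification chosen for this sketch.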
Thanks, Mitchell
P.S. Good to hear you want/like the wide format; I will try to keep it.
The download eventually finished but I am unable to reproduce your results. Here is an example of my results (top) vs yours (bottom):
I think your version of fibertools might be out of date.
Indeed, I installed fibertools via conda. I will try cloning the latest version of the repository, thanks.
Please reopen with the ft version if this doesn't fix it for you. Cheers!
First, thanks for providing your tool to analyze fiberseq data. Second, I use ft center to center my fibers around CTCF sites, and I see some wrong positions in the output file.
Here is my input read (from my BAM file): ccs1.txt
Here is the wrong output:
As you can see, the first centered m6A positions correspond to start+1 of my region of interest; the later ones are correct.
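To make this kind of problem easy to check on a small test case, here is a minimal Python sketch of a sanity check. It is not the fibertools implementation. It assumes that a centered position is simply the 0-based reference position minus the start of the region of interest, and it ignores strand. The function names and the example numbers are made up for illustration.

```python
def center_offsets(ref_positions, site_start):
    """Offsets of reference positions relative to the site start.

    Assumption for this sketch: centered = reference position - site start,
    with no strand handling.
    """
    return [p - site_start for p in ref_positions]


def find_suspect_offsets(reported, ref_positions, site_start):
    """Return (index, reported, expected) for every centered value that
    does not match the expected offset."""
    expected = center_offsets(ref_positions, site_start)
    return [
        (i, got, want)
        for i, (got, want) in enumerate(zip(reported, expected))
        if got != want
    ]


# Made-up example: the first reported value looks like an absolute
# coordinate (start + 1) instead of an offset; the other two are fine.
site_start = 1000
ref_positions = [990, 1005, 1020]
reported = [1001, 5, 20]
suspects = find_suspect_offsets(reported, ref_positions, site_start)
```

With a handful of reads and one or two sites, the list of suspects is short enough to inspect by hand and to turn into a regression test.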
Here is my command line:
ft center fiberseq-smk/sample_.fiberseq.bam CTCF.bed -t 48 -w --reference > sample_center_ctcf.txt
(fibertools was run on an HPC server running Red Hat Enterprise Linux 8.6.)
Thanks for your help in correcting this bug.
Best, Rachel