adnaniazi / tailfindr

An R package for estimating poly(A)-tail lengths in Oxford Nanopore RNA and DNA reads.
https://www.cbu.uib.no/valen/
GNU General Public License v3.0

Wrong read ID output when using custom cDNA adapter mode #24

Closed obegik closed 2 years ago

obegik commented 2 years ago

Hi Adnan,

I came across a problem when using the custom cDNA adapter mode. In the first column of the output, I noticed that some read IDs contain an extra "read" text, which results in invalid estimates.

Thanks, Oguzhan

adnaniazi commented 2 years ago

Hi Oguzhan,

Thank you for using tailfindr.

Tailfindr expects you to provide either single-read FAST5 files or multi-read FAST5 files, but not a mixture of both. If you give it a mixture, it will work correctly for only one of the two file types.

So you should segregate these two file types into separate folders and then run tailfindr separately on each folder.
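
One way to do this segregation is sketched below in Python. It is not part of tailfindr. It assumes that multi-read FAST5 files hold one top-level HDF5 group per read named `read_<read_id>`, that single-read files do not, and that the third-party `h5py` package is installed for reading the group names. The function names and the `single/` and `multi/` folder names are made up for illustration.

```python
import shutil
from pathlib import Path


def classify_keys(keys):
    """Classify a FAST5 file from its top-level HDF5 group names.

    Assumption: multi-read files store each read in a top-level group
    named 'read_<read_id>'; anything else is treated as single-read.
    """
    keys = list(keys)
    if keys and all(key.startswith("read_") for key in keys):
        return "multi"
    return "single"


def read_top_level_keys(path):
    """Return the top-level group names of an HDF5 file (needs h5py)."""
    import h5py  # third-party; imported lazily so the rest runs without it

    with h5py.File(path, "r") as handle:
        return list(handle.keys())


def split_by_type(src_dir, dest_dir, read_keys=read_top_level_keys):
    """Copy every *.fast5 under src_dir into dest_dir/single or dest_dir/multi.

    Files are copied, not moved, so the original folder stays untouched.
    Files with the same name in different subfolders would overwrite each
    other in the destination; this sketch does not handle that case.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    counts = {"single": 0, "multi": 0}
    # sorted() lists the files up front, before anything is copied.
    for path in sorted(src_dir.rglob("*.fast5")):
        kind = classify_keys(read_keys(path))
        target = dest_dir / kind
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
        counts[kind] += 1
    return counts
```

Calling `split_by_type("fast5_in", "fast5_sorted")` (both paths are placeholders) would return the number of files copied into each folder, after which tailfindr can be run once per folder.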

obegik commented 2 years ago

Hi Adnan,

Thanks for your prompt reply. I actually doubt that this is the issue, because when I ran tailfindr before with default settings (without providing the adapter sequence), I did not have this problem. It only happened when I introduced the adapter sequence.

Thanks, Oguzhan

adnaniazi commented 2 years ago

Can you email me at adnaniazi@gmail.com some example data and the command that you are using? I will try to debug it.

adnaniazi commented 2 years ago

I think the issue is as I described in my earlier comment. I designed tailfindr so that it first detects the type of the FAST5 files (single-read or multi-read) by reading just one random read from all the available FAST5 files in the top-level directory, and then, depending on the file type detected, dispatches all the reads to either the single-read or the multi-read processing scripts. If you provide a mix of single-read and multi-read files, tailfindr will only work for one type and fail completely for the other. In all likelihood, this is what is happening in your case. So just rerun tailfindr separately on a folder that contains only single-read FAST5 files, and then on another folder that contains only multi-read FAST5 files, and see what happens.
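
The behaviour described above can be modelled with a short Python sketch. It is a simplified illustration and not tailfindr's actual code (tailfindr is written in R); `dispatch_all`, `true_kind`, and the file names are hypothetical, and the sketch samples one file rather than one read. It shows why a mixed folder goes wrong: the type is decided once from a single sample and then applied to every file.

```python
import random


def dispatch_all(files, kind_of, rng=random):
    """Model of detect-once dispatch.

    files   : list of file names
    kind_of : function returning 'single' or 'multi' for a file name
    rng     : object with a .choice() method (the random module by default)

    The type is detected from ONE randomly sampled file, and every file is
    then sent to that handler whether or not it matches.
    """
    detected = kind_of(rng.choice(files))
    routed = {"ok": [], "misrouted": []}
    for name in files:
        bucket = "ok" if kind_of(name) == detected else "misrouted"
        routed[bucket].append(name)
    return detected, routed


def true_kind(name):
    """Hypothetical ground truth: the file name suffix encodes the type."""
    return "multi" if name.endswith("_multi.fast5") else "single"


uniform = ["a_multi.fast5", "b_multi.fast5"]
mixed = ["a_multi.fast5", "b_single.fast5", "c_single.fast5"]

# A uniform folder is always routed correctly.
print(dispatch_all(uniform, true_kind))
# A mixed folder always has at least one misrouted file.
print(dispatch_all(mixed, true_kind, random.Random(0)))
```

Whichever file happens to be sampled from the mixed folder, every file of the other type ends up in the wrong handler, which matches the advice to keep the two types in separate folders.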

obegik commented 2 years ago

Hi Adnan,

I will send you some data during the day. The thing is, this happens even when I run tailfindr on only one FAST5 file; I encounter the problem within that single file. So I am not sure that this is the reason.

adnaniazi commented 2 years ago

Okay. It could be a bug. Make a duplicate of this one FAST5 file, run tailfindr again on the two files, and see what happens.
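
For instance, the duplicate could be created with a few lines of Python before rerunning tailfindr on the folder. This is only a convenience sketch; the function name and the `_copy` suffix are arbitrary choices, not anything tailfindr requires.

```python
import shutil
from pathlib import Path


def duplicate_fast5(path):
    """Copy a FAST5 file next to the original, adding a '_copy' suffix."""
    path = Path(path)
    copy_path = path.with_name(path.stem + "_copy" + path.suffix)
    shutil.copy2(path, copy_path)  # copies the contents and file metadata
    return copy_path
```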

obegik commented 2 years ago

I am sorry I can't see the FAST5 file