no read after SplitBamCellTypes.py

YoonheeJ commented 1 year ago

Hello,

I am trying to run SplitBamCellTypes.py, but I get no Pass_reads after the run. The report shows:

Total_reads  Pass_reads  Reads_without_cell_type  Reads_without_CB  Total_time
996963314    0           372364356                37236768          2061.52

Could you please let me know what the issue is?

Thank you,

Francesc-Muyas commented 1 year ago

Dear user, what aligner did you use in your pipeline?

The SplitBamCellTypes.py script assumes that the aligner provides the CB (cell barcode), nM (number of mismatches) and NH (number of hits). The first one is used to get the cell barcode, and the last two tags are used to remove low-quality reads (reads with too many mismatches, or mapping to more than one genomic location). Therefore, if reads do not have these tags, they are filtered out.
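To make the filtering described above concrete, here is a minimal sketch of that logic. It is not the actual SplitBamCellTypes.py code: the function name `classify_read`, the reason strings, the use of a plain dict for the read's tags, and the thresholds (5 mismatches, 1 hit) are all assumptions made for illustration.

```python
def classify_read(tags, max_nM=5, max_NH=1):
    """Return "PASS" or a short reason why the read would be dropped.

    `tags` is a plain dict standing in for a read's optional BAM tags,
    e.g. {"CB": "AAACCTGAGT-1", "nM": 0, "NH": 1}. Hypothetical helper.
    """
    # A read without any of the three tags is filtered out.
    if "CB" not in tags:
        return "no_CB_tag"
    if "nM" not in tags:
        return "no_nM_tag"
    if "NH" not in tags:
        return "no_NH_tag"
    # Low-quality filters: too many mismatches, or more than one mapping location.
    if tags["nM"] > max_nM:
        return "too_many_mismatches"
    if tags["NH"] > max_NH:
        return "multi_mapped"
    return "PASS"


# A read carrying all three tags, as assumed by the script.
read_with_all_tags = {"CB": "AAACCTGAGT-1", "nM": 1, "NH": 1}
# A read with NM (upper case) but neither nM nor NH.
read_without_nM_NH = {"CB": "AAACCTGAGT-1", "NM": 1}

print(classify_read(read_with_all_tags))   # PASS
print(classify_read(read_without_nM_NH))   # no_nM_tag
```

Tag names are case-sensitive, so a read that only carries `NM` still counts as lacking `nM`. If every read in a BAM file looks like the second example, the pass count ends up at 0.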

However, we know that not all aligners provide the nM and NH tags, so we plan to update this part of the script in the coming days to allow users to run the tool without them.

Cheers, Fran

YoonheeJ commented 1 year ago

Dear Fran,

Thank you for the quick reply. I used a scATAC-seq library aligned with CellRanger. As you mentioned, my BAM files do not contain those tags.

I look forward to the update!

Thanks, Yoonhee

Francesc-Muyas commented 1 year ago

Dear Yoonhee,

We have updated the scripts/SplitBam/SplitBamCellTypes.py script so that SComatic can work without the nM and NH tags. However, we strongly suggest using the --max_nM 5 and --max_NH 1 parameters whenever possible to remove low-quality reads and achieve performance similar to that described in the SComatic manuscript. In addition, the new script version provides a more detailed report file (*.report.txt) with the number of PASS and filtered reads, as well as the reason why reads were filtered out.

Thanks for your feedback, Fran

YoonheeJ commented 1 year ago

Thank you Fran!!

moshl commented 1 year ago

Hi, Yoonhee. I have run into the same problem. After updating the scripts, I still get no Pass_reads. Did you solve it?

Francesc-Muyas commented 1 year ago

Dear user,

Could you please share the command that you used for running this step? Also, what type of data are you working with, and how were the files processed (aligned)? In theory, the problem should be solved.

Cheers, Fran

moshl commented 1 year ago

Hi, Fran. Yes, I solved it. I just preprocessed the scATAC-seq data with cellranger. I am confused because the NM tag is present in the BAM files, but the nM parameter does not work. Looking forward to your reply!

Best wishes, Mo Mo

Francesc-Muyas commented 1 year ago

Hi Mo, in the current version of SComatic, if your BAM files do not contain the exact nM or NH tags, you should not use these filters (they were mainly designed for scRNA-seq data). However, we are currently working on supporting either NM (scATAC-seq) or nM (scRNA-seq), depending on the input BAM file. I hope we can update the code in the coming days to do so.
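For anyone who wants to check which of these tags their BAM file actually carries before choosing filters, here is a minimal sketch. It is not part of SComatic: the helper name `count_tags` and the example reads are made up for illustration, and it assumes the input is SAM-formatted text (for example, the first few thousand lines printed by `samtools view`).

```python
from collections import Counter


def count_tags(sam_lines, wanted=("CB", "nM", "NM", "NH")):
    """Count how many alignment lines carry each tag of interest.

    Expects SAM text lines; optional TAG:TYPE:VALUE fields start at column 12.
    Hypothetical helper, not part of SComatic.
    """
    counts = Counter()
    n_reads = 0
    for line in sam_lines:
        # Skip empty lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        n_reads += 1
        fields = line.rstrip("\n").split("\t")
        present = {field.split(":", 1)[0] for field in fields[11:]}
        for tag in wanted:
            if tag in present:
                counts[tag] += 1
    return n_reads, counts


# Two made-up reads that carry NM but neither nM nor NH.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tNM:i:0\tCB:Z:AAACCTGAGT-1",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tFFFF\tNM:i:1",
]
n_reads, counts = count_tags(example)
for tag in ("CB", "nM", "NM", "NH"):
    print(f"{tag}: {counts[tag]} of {n_reads} reads")
```

If `nM` and `NH` come back as 0, the mismatch and hit filters should be left off, as described above.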

Thanks for your feedback, Fran

moshl commented 1 year ago

Hi Fran, sorry for the late feedback. I just updated the SplitBam script, but it doesn't work, and the code for the --max_nM parameter still reports the same error.

Francesc-Muyas commented 1 year ago

Dear user, we still need to update this part of the script. We are currently working on it but must perform sanity checks before publicly releasing the new updates. Until then, if you are working with scATAC-seq data, please do not use the NM and NH filters as described in the documentation.

We will announce this update when available.

Thanks for your time, Fran

cortes-ciriano-lab / SComatic

no read after SplitBamCellTypes.py #3