hyunhwan-jeong / CB2

CB2 is an R package that provides functions for hit gene identification and quantification of sgRNA (single-guide RNA) abundances for CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) pooled screen data analysis. Details are in Jeong et al. (2019) <doi:10.1101/gr.245571.118> and Baggerly et al. (2003) <doi:10.1093/bioinformatics/btg173>.
https://cran.r-project.org/web/packages/CB2/index.html

Inconsistent results between .fq.gz and .fastq? #4

AMChalkie closed this issue 4 years ago

AMChalkie commented 4 years ago

Strange as this may sound, I get different results when using a .fq.gz file vs. a .fastq file (where the .fastq is a zcat of the original). The mappability differs: 0.018-ish vs. the expected 85%.

It would be great if .fq.gz input were handled correctly and automatically, or at least if there were a warning that this might be what's happening.

Otherwise, very useful software. Thank you!
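
A warning of the kind suggested here could be driven by the gzip magic number (the bytes 0x1f 0x8b at the start of the file) rather than by the file name. The sketch below uses base R only. `looks_gzipped` and `check_fastq_input` are hypothetical names for illustration and are not part of CB2.

```r
# Hypothetical helper (not part of CB2): TRUE when the file starts with the
# two-byte gzip magic number 0x1f 0x8b.
looks_gzipped <- function(path) {
  # raw = TRUE asks file() not to probe for compression, so we read the
  # bytes exactly as they are stored on disk.
  con <- file(path, "rb", raw = TRUE)
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 2L)
  length(magic) == 2L && identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Hypothetical pre-flight check: warn instead of silently reading
# compressed bytes as if they were plain FASTQ text.
check_fastq_input <- function(path) {
  if (looks_gzipped(path)) {
    warning("'", path, "' starts with the gzip magic number; ",
            "decompress it first or use a version that reads gzipped FASTQ.",
            call. = FALSE)
  }
  invisible(path)
}
```

Checking the content instead of the extension also catches gzipped files that were renamed to `.fastq`.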

hyunhwan-jeong commented 4 years ago

Dear @AMChalkie,

The current version does not support gzipped FASTQ files, so you need to unzip them first. Your suggestion would be a great addition to CB^2, though, and I will start adding this feature soon.
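
For anyone who needs the workaround in the meantime, one way to decompress from within R without extra packages is to stream the file through a `gzfile` connection. This is a minimal sketch: `gunzip_fastq` is a hypothetical helper, not a CB2 function, and the default output name is an assumption that simply drops the `.gz` suffix.

```r
# Hypothetical helper (not part of CB2): write a decompressed copy of a
# gzipped FASTQ file and return the output path invisibly.
gunzip_fastq <- function(gz_path, out_path = sub("\\.gz$", "", gz_path)) {
  stopifnot(file.exists(gz_path), !identical(gz_path, out_path))
  in_con <- gzfile(gz_path, "rb")      # yields decompressed bytes
  on.exit(close(in_con), add = TRUE)
  out_con <- file(out_path, "wb")
  on.exit(close(out_con), add = TRUE)
  repeat {
    # Copy in chunks so large FASTQ files are never held fully in memory.
    chunk <- readBin(in_con, what = "raw", n = 1000000L)
    if (length(chunk) == 0L) break
    writeBin(chunk, out_con)
  }
  invisible(out_path)
}
```

Calling `gunzip_fastq("sample.fq.gz")` (a placeholder file name) would write `sample.fq` next to the original, which can then be passed to CB2 as a plain FASTQ file.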

Thank you,

Hyun-Hwan Jeong

hyunhwan-jeong commented 4 years ago

@AMChalkie, the current GitHub version supports gzipped FASTQ files. Would you be willing to test that version and let me know if you run into any problems?

Thank you!

Hyun-Hwan Jeong

AMChalkie commented 4 years ago

Yes, that works fine now. I notice that mappability information is no longer printed during the run, which I thought was a good feature.

hyunhwan-jeong commented 4 years ago

@AMChalkie, you should see the message if you set the verbose parameter to TRUE.

Thank you,

Hyun-Hwan Jeong

hyunhwan-jeong commented 4 years ago

@AMChalkie, can you test it again? The version you tested had an incompatibility issue with Windows, so I made some changes and would like to confirm that they do not cause any new problems.

Thank you,

Hyun-Hwan Jeong

monovich commented 2 years ago

Was just experimenting with .fq.gz vs .fastq inputs and it looks like .fastq.gz is not accepted for some reason. I think it would make sense to add support for it as .fq.gz and .fastq.gz are both fairly common.

Edit: I started with a fresh R environment and can't reproduce this behavior. I suspect it was an environment path issue, not a CB2 issue. Sorry for the false alarm.

hyunhwan-jeong commented 2 years ago

Not a problem. As you figured out, it should support both, since CB2 determines whether a file is gzipped by checking the file name's suffix:

https://github.com/LiuzLab/CB2/blob/57d7fd5548c162b175a9dc6f5a57f1f73822eb0d/R/helpers.R#L69
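
A suffix check of this kind can be sketched in a few lines of base R. This is an illustration of the idea, not the code at the link above: `is_gz_name` is a hypothetical name, and matching the suffix case-insensitively is an assumption.

```r
# Hypothetical sketch (not CB2's actual code): treat any file whose name
# ends in ".gz" as gzipped, which covers both .fq.gz and .fastq.gz.
is_gz_name <- function(path) {
  grepl("\\.gz$", path, ignore.case = TRUE)
}

is_gz_name(c("sample.fq.gz", "sample.fastq.gz", "sample.fastq"))
# TRUE TRUE FALSE
```

Because only the trailing `.gz` is inspected, the part of the name before it (`.fq` or `.fastq`) makes no difference to the result.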