Closed JasperYH closed 4 years ago
input_file <- "LIBILCM_9.txt"
name <- strsplit(input_file,"\\.")[[1]][1]
out_file <- paste("./cache/traceQC_obj/",name,".rds",sep="")
if (!file.exists(out_file)) {
print(name)
obj <- create_TraceQC_object(aligned_reads_file = paste("./cache/alignment/",input_file,sep=""),
ref_file = ref_file,
fastqc_file = paste("./qc_report/",name,"_fastqc.zip",sep=""),
alignment_score_threshold = 200,
abundance_threshold = 10,
ncores = 1)
saveRDS(obj,file=out_file)
Hi @JasperYH, I ran the script (with small modifications), and I did not see any error.
setwd("/mnt/hdd/jasper/CRISPR_barcode/Shawn3")
input_file <- "LIBILCM_9.txt"
ref_file <- "/home/jasper/hdd/CRISPR_barcode/Shawn3/data/ref21.txt"
name <- strsplit(input_file,"\\.")[[1]][1]
print(name)
obj <- TraceQC::create_TraceQC_object(aligned_reads_file = paste("./cache/alignment/",input_file,sep=""),
ref_file = ref_file,
fastqc_file = paste("./qc_report/",name,"_fastqc.zip",sep=""),
alignment_score_threshold = 200,
abundance_threshold = 10,
ncores = 1)
saveRDS(obj,file=out_file)
Can you try it once you reinstall the package by devtools::install_github("https://github.com/LiuzLab/TraceQC")
?
Can you check the output file to see if the filter works?
I don't think so
> max(obj$aligned_reads$score)
[1] 293.4
> min(obj$aligned_reads$score)
[1] -29.6
Just checked obj$mutation
, but it seems the filter is not working
> summary(obj$mutation)
target_seq
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCGCGTATGCCGTCTTCTGCTTGAA----------AAAATAAG: 17
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCTCGTATGCCGTCTTCTGCTTGAAAAA--AA-----------: 17
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCTCGTATGCCGTCTTCTGCTTGAAAAA-----------AAAG: 17
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCTCGTATGCCGTCTTCTGCTTGAA----------AAAATAAG: 17
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCACGTATGCCGTCTTCTGCTTGAA---------------AAA: 16
AACTTGAAAG------------------------ATC---------GGAAGAGCACACGTCTGAACTCCAGT------CACCGCTCA------TTATCGCGTATGCCGTCTTCTGCTTGAAAAA---------------: 16
(Other) :4651
alignment_score type start length mutate_to count
Min. : 60.9 unmutated: 1 Min. : 0.00 Min. : 0.000 G :1070 Min. : 24
1st Qu.: 79.0 deletion : 996 1st Qu.: 61.00 1st Qu.: 1.000 - : 997 1st Qu.: 322
Median :246.8 mutation :2735 Median : 63.00 Median : 1.000 A : 836 Median : 614
Mean :205.2 insertion:1019 Mean : 64.96 Mean : 6.741 C : 530 Mean : 5889
3rd Qu.:264.5 3rd Qu.: 83.00 3rd Qu.:14.000 AGACG : 471 3rd Qu.: 1989
Max. :293.4 Max. :112.00 Max. :97.000 AGCAAACGTTTGGA: 356 Max. :1923186
(Other) : 491
When I run the code directly (not using the function), it works well.
Can you share your output as the same as I did?
I added some print()
for a debug purpose (not commit this modification) as follows:
print(summary(aligned_reads))
print(nrow(aligned_reads))
print(aligned_reads %>% filter(score>alignment_score_threshold) %>% nrow())
aligned_reads <- aligned_reads %>%
filter(score>alignment_score_threshold) %>%
group_by(target_seq,target_ref) %>%
summarise(count=n(),score=max(score)) %>%
ungroup
print(sum(aligned_reads$count))
Here is the output:
name seq ref score target_ref target_seq spacer_ref
Length:7770926 Length:7770926 Length:7770926 Min. :-29.6 Length:7770926 Length:7770926 Length:7770926
Class :character Class :character Class :character 1st Qu.:238.8 Class :character Class :character Class :character
Mode :character Mode :character Mode :character Median :250.8 Mode :character Mode :character Mode :character
Mean :213.9
3rd Qu.:264.5
Max. :293.4
spacer_seq PAM_ref PAM_seq
Length:7770926 Length:7770926 Length:7770926
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
[1] 7770926
[1] 7758082
[1] 7758082
[1] 6805198
I finally found why. The augments were swapped.
In TraceQC.R
:
In mutation_events.R
:
It has been solved, and I will commit the fix: Let's place alignment_score_threshold
first.
The abundance_threshold and alignment_score_threshold filter doesn't work. I'm not sure why.