Closed Austin-s-h closed 1 year ago
This is because the files you've referenced above are not the right files. They correspond to our targeted amplicon sequencing experiments, and are not CHANGE-seq FASTQ files. The CHANGE-seq runs are classified as 'OTHER' for the purposes of NCBI SRA and are listed in the same project.
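For anyone hitting the same mix-up, here is a minimal sketch of how the CHANGE-seq runs could be separated from the amplicon runs. It assumes you have exported the project's run table from SRA as a RunInfo-style CSV with `Run` and `LibraryStrategy` columns; check the header of your own export, since column names can differ between export routes. The function name `select_runs` and the accessions in the example are placeholders, not real runs from this project.

```python
import csv
import io


def select_runs(runinfo_csv_text, strategy="OTHER"):
    """Return run accessions whose LibraryStrategy equals `strategy`.

    Assumes a CSV with 'Run' and 'LibraryStrategy' columns (an assumption
    about the export format; verify against your own file).
    """
    reader = csv.DictReader(io.StringIO(runinfo_csv_text))
    wanted = strategy.strip().upper()
    return [
        row["Run"]
        for row in reader
        if row.get("LibraryStrategy", "").strip().upper() == wanted
    ]


# Made-up rows for illustration only; these accessions are placeholders.
example = """Run,LibraryStrategy,LibraryName
SRR0000001,AMPLICON,example_amplicon
SRR0000002,OTHER,example_changeseq
"""

changeseq_runs = select_runs(example)          # runs classified as OTHER
amplicon_runs = select_runs(example, "AMPLICON")
print(changeseq_runs)
```

With a real export, read the file contents into a string and pass it in; the runs returned for `OTHER` are the candidates to download as CHANGE-seq FASTQ input.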
Hello and thank you for developing CHANGE-Seq!
I wanted to reach out because I am having issues recreating the off-target results (specifically TRAC site 2 and CTL4 site 9) as presented in Figure 4F of the Nature Biotechnology publication. From SRA, I downloaded what I think are the appropriate files.
Processing via CHANGE-Seq v1.2.9.1 using the default arguments results in the pipeline crashing after alignment. The alignment files look okay (reasonable size compared to successful samples), but the _CONTROL, _NUCLEASE, and _count.txt files are all very small (~1/100th the size of a successful sample). I attempted to combine the replicates at the FASTQ level but still ended up with the same results. In addition, there appear to be no real differences between the control and nuclease samples, as read counts are similar at the positions that are collected. The end result is that no _identified_matched.txt file is created, or if there is one, it is empty.
Could you help provide any information that may help recreate this off-target result? Are these input files correct? What steps within the identification script may result in this behavior?
Thank you very much!
Error logging
head of _CONTROL_coordinates.txt
head of _NUCLEASE_coordinates.txt
head of _count.txt