gouxiaojuan closed 6 months ago
@gouxiaojuan Sorry, I just saw your issue.
How do I get from FASTQ files to fragment and other files? i) We align the FASTQ files to mm10 with bwa to produce BAM files. ii) We then use SnapATAC2 to load the BAM files directly and generate the fragment files. You can read this: get fragment
Is there a tutorial for this? No, but I put all the code in the ~00.data.preprocess~ directory; you can browse the scripts there. I have not organized this directory very well, so feel free to let me know if you have further questions.
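The two steps above can be sketched roughly as follows. This is only an illustration, not the code from the repository: the helper names `bwa_mem_command` and `bam_to_fragments`, the file names, and the thread count are hypothetical, and the SAM-to-BAM conversion between the two steps is omitted. How barcodes are extracted from the BAM records is not stated in this thread, so any such options are passed through as keyword arguments and must match your own BAM files.

```python
def bwa_mem_command(index_prefix, fastq_r1, fastq_r2, threads=8):
    """Build (but do not run) a `bwa mem` command for paired-end reads.

    `index_prefix` is assumed to point at a bwa index built for mm10.
    bwa writes SAM to stdout; converting it to BAM is left out here.
    """
    return ["bwa", "mem", "-t", str(threads),
            str(index_prefix), str(fastq_r1), str(fastq_r2)]


def bam_to_fragments(bam_path, fragment_path, **kwargs):
    """Thin wrapper around SnapATAC2's make_fragment_file.

    SnapATAC2 is imported lazily so this sketch loads without it.
    Extra keyword arguments (for example, how barcodes are read from
    the BAM records) are forwarded unchanged and are assumptions you
    must fill in for your own data.
    """
    import snapatac2 as sa2
    return sa2.pp.make_fragment_file(str(bam_path), str(fragment_path), **kwargs)


# Hypothetical file names, for illustration only.
cmd = bwa_mem_command("mm10.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz", threads=4)
print(" ".join(cmd))
```

For the exact commands and parameters that were used, the scripts in ~00.data.preprocess~ remain the reference.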
About fragments files.
Thanks! Songpeng
Hi @beyondpie,
I am wondering whether any further filtering/processing of the fragments files or raw BAM files was performed in addition to the two QC steps:
which removed, as stated in the paper, “7% of nuclei that were deemed to be potential doublets”. I have generated class-level fragments files using sa2.pp.make_fragment_file() as in the script you posted above, on the raw BAM files you uploaded earlier, and there seem to be more fragments per class in Supplementary Table 2 than in these class-level fragments files. Could you please clarify how exactly the “# of Fragments” column in Supplementary Table 2 was generated?
I have a few additional questions:
sa2.pp.make_fragment_file()
?
chrX 138383596 107973759 CEMBA200827_7H.ATGGTTTGGGCGCGACTTGAGA 5
Best, Jay
@beyondpie When counting the number of occurrences of barcodes in the corresponding sample fragments file, I encounter off-by-one discrepancies:
Sanity check failed: CEMBA201210_10D.TGGTGCGCATGTACAACTCTAG. Metadata says 25209 but .tsv file says 25208
Sanity check failed: CEMBA181023_6B.AAGCAAAGTCACTCTTCCTCAT. Metadata says 6311 but .tsv file says 6312
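For anyone who wants to reproduce this kind of check, here is a minimal sketch of counting fragments per barcode and comparing against metadata. It assumes a tab-separated fragments file with five columns (chromosome, start, end, barcode, count), as in the example line earlier in this thread. The function names and the toy barcodes are hypothetical. It reports both the number of rows and the sum of the fifth column per barcode, since either could be what a metadata table records; it does not explain the off-by-one differences above.

```python
from collections import Counter


def count_fragments(lines):
    """Return (rows_per_barcode, summed_count_per_barcode)."""
    rows = Counter()
    summed = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        barcode = fields[3]
        rows[barcode] += 1
        # The fifth column is treated as a per-row count; default to 1 if absent.
        summed[barcode] += int(fields[4]) if len(fields) > 4 else 1
    return rows, summed


def find_mismatches(observed, expected):
    """Map barcode -> (expected, observed) wherever the two differ."""
    return {bc: (n, observed.get(bc, 0))
            for bc, n in expected.items()
            if observed.get(bc, 0) != n}


# Toy data with made-up barcodes, for illustration only.
toy_lines = [
    "chr1\t100\t200\tS1.AAAA\t2",
    "chr1\t300\t400\tS1.AAAA\t1",
    "chr2\t50\t150\tS1.CCCC\t1",
]
rows, summed = count_fragments(toy_lines)
mismatches = find_mismatches(rows, {"S1.AAAA": 2, "S1.CCCC": 2})
for bc, (meta, found) in mismatches.items():
    print(f"Sanity check failed: {bc}. Metadata says {meta} but .tsv file says {found}")
```

For a real file, the same function can be fed an open handle, for example `gzip.open(path, "rt")` for a gzip-compressed fragments file.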
Thanks, Jay
@jayluo2
I also noticed that there is a “bam2bedpe” functionality here. Is this how you would recommend generating bedpe files from raw BAM files?
My colleague previously generated the bedpe files using the code here: https://github.com/beyondpie/CEMBA_wmb_snATAC/blob/543bf5c73a6f34638bfcdff8fab9400d391598ae/00.data.preprocess/src/main/pipeline/alignment.Snakefile#L255 I haven't run this part myself; if you still need the bedpe files, I would suggest following the code there. If you run into problems, let's discuss them then.
Sincerely, Songpeng
@beyondpie Either fragments files (but shifted slightly differently, as we discussed) or bedpe files will work for me. Since sa2.pp.make_fragment_file() already has functionality to change fragment start/end shifting, perhaps we can stick with fragments files for now?
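To make the shifting question concrete, here is a small sketch of re-shifting the start/end of an already generated fragment line, as an alternative to regenerating the file from BAM. The function `reshift_fragment` and its parameter names are hypothetical. The default "old" offsets of +4/-5 follow the common Tn5 convention and are only an assumption; this thread does not state which shift was applied to the existing files.

```python
def reshift_fragment(line, old_left=4, old_right=-5, new_left=0, new_right=0):
    """Undo an assumed (old_left, old_right) shift and apply a new one.

    `line` is one tab-separated fragment record:
    chromosome, start, end, barcode, count.
    """
    chrom, start, end, barcode, count = line.rstrip("\n").split("\t")[:5]
    new_start = int(start) - old_left + new_left
    new_end = int(end) - old_right + new_right
    return "\t".join([chrom, str(new_start), str(new_end), barcode, count])


# Made-up record: remove an assumed +4/-5 shift to recover unshifted ends.
example = "chr1\t1004\t1195\tSAMPLE.ACGT\t1"
unshifted = reshift_fragment(example)
print(unshifted)
```

If the original shift is not known for certain, regenerating the fragments from the BAM files with the desired shift settings is the safer route.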
Best, Jay
@jayluo2 I am closing this issue now. If you have further questions, just let me know. Thanks! Songpeng
Hello, may I ask what pipeline was used in this paper https://www.nature.com/articles/s41586-023-06824-9#data-availability to go from FASTQ files to fragments and other files? Is there a corresponding tutorial? I am asking because I now need files such as the fragments files that 10x Genomics provides, but I did not find them in the data you provided. Alternatively, could you provide the fragment files for each sample? Thank you very much!