UcarLab / AMULET

A count based method for detecting doublets from single nucleus ATAC-seq (snATAC-seq) data.
https://ucarlab.github.io/AMULET/
GNU General Public License v3.0

Use with outputs of scATAC-pro #20

Open jonhsussman opened 2 years ago

jonhsussman commented 2 years ago

Hello,

I am wondering whether there is a straightforward way to use AMULET with the output of scATAC-pro. The vignettes seem to require the outputs from cellranger, which conveniently include a csv file. scATAC-pro produces a cell_barcodes.bam file. Is there a way to convert this to a csv file so that it is in a form that works with AMULET?

Thanks, Jonathan

ajt986 commented 2 years ago

Hi Jonathan,

For AMULET, you just need to provide the fragments (either the .bam file or a .txt/.tsv.gz file in the same format as the fragments.tsv.gz file from CellRanger) and the CSV file with the barcodes. The CSV file can be whatever you want as long as it has a header and the following columns: 1) barcode 2) is__cell_barcode. The 'is__cell_barcode' column is 1 if the barcode is one used in the analysis and anything else (e.g., 0) if it should not be included. The python reader just looks for these column names to identify them. For the bam file reader, you just need to provide the column indices using '--cellidx' and '--iscellidx'. For example, if you have a csv file where the first column is your 'barcode' and the second is the 'is__cell_barcode', you would add --cellidx 0 --iscellidx 1.
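As a rough illustration of the CSV layout described above, here is a minimal Python sketch that writes such a file from a list of barcodes. The function name `write_barcode_csv` and its arguments are hypothetical and not part of AMULET; only the header names 'barcode' and 'is__cell_barcode' come from the description above.

```python
import csv


def write_barcode_csv(barcodes, kept, out_path):
    """Write a two-column CSV with a header row.

    barcodes: every barcode to list in the file.
    kept:     the barcodes to analyse (flagged 1); all others are flagged 0.
    This helper is hypothetical, not something AMULET provides.
    """
    kept = set(kept)
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        # Header names as described in the comment above.
        writer.writerow(["barcode", "is__cell_barcode"])
        for bc in barcodes:
            writer.writerow([bc, 1 if bc in kept else 0])
```

With this layout the barcode is in column 0 and the flag in column 1, which corresponds to --cellidx 0 --iscellidx 1.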

You'll need to check what's available from scATAC-pro if you need to convert the fragment/bam files. It is more involved, but essentially the bam file is just a typical paired-end ATAC-seq bam file with an additional attribute that stores the barcode. The default is "CB" from cellranger, and this is what AMULET looks for. If there's a different attribute name, you can specify it with the --bambc option. The fragment file just needs to be in the same format as specified here: https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/output/fragments

As long as the barcodes match between the fragment/bam file and the csv file, AMULET should be good to go.
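If it helps, a quick sanity check along these lines can confirm the barcodes agree before running anything. This is only a sketch under a few assumptions: the fragment file is tab-separated with the barcode in the fourth column (as in the 10x fragments format), the CSV has one header row, and the function names are made up for illustration rather than taken from AMULET.

```python
import csv
import gzip


def read_csv_barcodes(csv_path, cell_idx=0, iscell_idx=1):
    """Return the barcodes flagged 1 in the CSV (the header row is skipped)."""
    cells = set()
    with open(csv_path, newline="") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) <= max(cell_idx, iscell_idx):
                continue  # ignore blank or short rows
            if row[iscell_idx].strip() == "1":
                cells.add(row[cell_idx])
    return cells


def read_fragment_barcodes(fragments_path):
    """Return every barcode seen in column 4 of a fragments file."""
    opener = gzip.open if fragments_path.endswith(".gz") else open
    seen = set()
    with opener(fragments_path, "rt") as fh:
        for line in fh:
            if line.startswith("#"):
                continue  # skip comment/header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 4:
                seen.add(fields[3])
    return seen


def missing_barcodes(csv_path, fragments_path, cell_idx=0, iscell_idx=1):
    """Barcodes flagged as cells in the CSV that never occur in the fragments."""
    cells = read_csv_barcodes(csv_path, cell_idx, iscell_idx)
    return cells - read_fragment_barcodes(fragments_path)
```

An empty result means every flagged barcode appears at least once in the fragment file. If nearly all of them come back as missing, the two files probably use different barcode spellings, for example a suffix present in one but not the other.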

Best, Asa

jonhsussman commented 2 years ago

Hi Asa,

Thank you for your suggestions here. I will look into these comments carefully.

--Jonathan