GreenleafLab / ArchR

ArchR : Analysis of Regulatory Chromatin in R (www.ArchRProject.com)

`getInputFiles()` cannot import `fragments.tsv.gz` #1532

Closed: jiangpuxuan closed this issue 2 years ago

jiangpuxuan commented 2 years ago

I ran cellranger-atac and got a standard 10x `outs` directory. Here are the files in `outs`:

analysis        filtered_peak_bc_matrix     filtered_tf_bc_matrix.h5  peak_annotation.tsv     possorted_bam.bam      raw_peak_bc_matrix.h5  summary.csv
cloupe.cloupe   filtered_peak_bc_matrix.h5  fragments.tsv.gz          peak_motif_mapping.bed  possorted_bam.bam.bai  singlecell.csv         summary.json
data_processed  filtered_tf_bc_matrix       fragments.tsv.gz.tbi      peaks.bed               raw_peak_bc_matrix     singlecell.csv.back    web_summary.html

But when I use `getInputFiles()` to import `fragments.tsv.gz`, I get an error:

#Get Input Fragment Files
inputFiles <- getInputFiles("~/outs/fragments.tsv.gz")
inputFiles

ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  minTSS = 1, # Don't set this too high because you can always increase it later
  minFrags = 1000, 
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  #validBarcodes = barcode
)
proj_crca001 <- ArchRProject(ArrowFiles)

# `inputFiles` is empty
#Error in .validInput(input = sampleNames, name = "sampleNames", valid = c("character")) : Input value for 'sampleNames' is not a character, (sampleNames = NULL) please supply valid input!

`inputFiles` is empty and nothing is imported.

I tried moving fragments.tsv.gz and fragments.tsv.gz.tbi to a separate directory, but got the same error.

If I change the path to ~/outs, `inputFiles` is no longer empty, but then the pipeline fails in `createArrowFiles()`. Here is my log: ArchR-createArrows-23328111e041e-Date-2022-07-31_Time-11-06-42.log

inputFiles <- getInputFiles("~/outs")
inputFiles
#possorted 
#"~/outs/possorted_bam.bam" 

Thank you for your help!

rcorces commented 2 years ago

Hi @jiangpuxuan! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
Before we help you, you must respond to the following questions unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example.
3. Did you post your log file? If not, add it now.
4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces commented 2 years ago

I have clarified the functionality of getInputFiles() via https://github.com/GreenleafLab/ArchR/commit/b00ce31a22edad4edfc1a810db6330088abb5d42

Only files that match ".fragments.tsv.gz" and ".bam" will be captured. This is because the prefix of the file is used to create the name of the sample.
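
To make the matching rule above concrete, here is a minimal base-R sketch. It is not ArchR's actual implementation: `match_input_files()` is a hypothetical helper, and it assumes the two patterns are applied as regular expressions to file names, with the matched part stripped to form the sample name. Under that assumption, a file named exactly `fragments.tsv.gz` has no prefix in front of `fragments`, so it is not captured, while `possorted_bam.bam` is captured with the name `possorted`, as seen in the output earlier in this thread.

```r
# Hypothetical helper mimicking the rule described above (NOT ArchR's code).
# Assumption: patterns are regular expressions, so the leading "." must match
# at least one character in front of "fragments" or "bam".
match_input_files <- function(file_names) {
  # Ignore index files such as .tbi and .bai
  candidates <- file_names[!grepl("\\.(tbi|bai)$", file_names)]
  out <- character(0)
  for (pattern in c(".fragments.tsv.gz", ".bam")) {
    matched <- candidates[grepl(pattern, candidates)]
    if (length(matched) > 0) {
      # The part of the file name left after removing the pattern becomes the sample name
      names(matched) <- gsub(pattern, "", matched)
      out <- c(out, matched)
    }
  }
  out
}

# A subset of the files listed in the original post
outs <- c("fragments.tsv.gz", "fragments.tsv.gz.tbi",
          "possorted_bam.bam", "possorted_bam.bam.bai", "peaks.bed")

# Only the BAM file is captured; the unprefixed fragments file is skipped
print(match_input_files(outs))

# With no match at all, the result has no names (names() returns NULL)
print(names(match_input_files("fragments.tsv.gz")))

# A prefixed file name (the prefix "crca001" is illustrative) is captured
print(match_input_files("crca001.fragments.tsv.gz"))
```

If this reading is right, two workarounds seem plausible: rename the file so it carries a sample prefix (for example `crca001.fragments.tsv.gz`, together with its `.tbi` index) and pass the containing directory to `getInputFiles()`, or skip `getInputFiles()` and pass a named character vector such as `c(crca001 = "~/outs/fragments.tsv.gz")` as `inputFiles` to `createArrowFiles()`, whose `sampleNames` argument defaults to `names(inputFiles)`.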