cellgeni / cellatac

Sanger Cellular Genetics single-cell ATAC-seq pipeline.
GNU General Public License v3.0

Clustering step failing on cellranger-arc mapping output #6

Open egerc opened 2 weeks ago

egerc commented 2 weeks ago

Dear cellatac devs,

When running cellatac on a single sample of my cellranger-arc mapping output, I run into the following error during the clustering step of the script (specifically at this step: so <- RunTFIDF(so)). Using 'cusanovich' or 'episcanpy' instead similarly failed at the clustering step. I also tried running ca_seurat_clades.R locally on my laptop using the files (cell.names, filtered_window_bc_matrix.mmtx.gz, regions.names) present in the mmtx folder; in that case the script already fails when creating the Seurat object.

Converting f_binary_mat like this

f_binary_mat <- as(f_binary_mat, "dgCMatrix")

or this

f_binary_mat <- as.matrix(f_binary_mat)

didn't help either.

Any help with what I'm doing wrong would be appreciated :).

Encountered error:

```
> so <- RunTFIDF(so)
Performing TF-IDF normalization
Error in checkSlotAssignment(object, name, value) :
  assignment of an object of class “dgeMatrix” is not valid for slot ‘data’ in an object of class “Assay”; is(value, "AnyMatrix") is not TRUE
Calls: RunTFIDF ... SetAssayData.Assay -> slot<- -> checkSlotAssignment
Execution halted
INFO: Cleaning up image...
```

I ran cellatac using the following script:

```bash
rm -rf .nextflow reports results work
rm .nextflow.log

source=/home/ceger/cellatac/
sample_dir=/mnt/LaCIE/ceger/Projects/human_heart_mapping/backup/.data/mapping_py/HCAHeart9508627_HCAHeart9508819/HCAHeart9508627_HCAHeart9508819/outs/
manifest="$sample_dir"/"singlecell.csv"
posbam="$sample_dir"/"atac_possorted_bam.bam"
fragments="$sample_dir"/"atac_fragments.tsv.gz"
chromlen=chromlen.txt
cellbatchsize=400
nclades=10

nextflow run $source \
    --cellcsv $manifest \
    --fragments $fragments \
    --cellbatchsize $cellbatchsize \
    --posbam $posbam \
    --chromlen $chromlen --outdir results \
    --sampleid HCAHeart9508627_HCAHeart9508819 \
    -profile local \
    --mermul true \
    --usecls '__seurat__' \
    --mergepeaks true \
    -with-report reports/report.html \
    -resume -w work -ansi-log false \
    -config my.config
```

.nextflow.log ```log Jul-10 10:14:58.927 [main] DEBUG nextflow.cli.Launcher - $> nextflow run /home/ceger/cellatac/ --cellcsv /mnt/LaCIE/ceger/Projects/human_heart_mapping/backup/.data/mapping_py/HCAHeart9508627_HCAHeart9508819/HCAHeart9508627_HCAHeart9508819/outs//singlecell.csv --fragments /mnt/LaCIE/ceger/Projects/human_heart_mapping/backup/.data/mapping_py/HCAHeart9508627_HCAHeart9508819/HCAHeart9508627_HCAHeart9508819/outs//atac_fragments.tsv.gz --cellbatchsize 400 --posbam /mnt/LaCIE/ceger/Projects/human_heart_mapping/backup/.data/mapping_py/HCAHeart9508627_HCAHeart9508819/HCAHeart9508627_HCAHeart9508819/outs//atac_possorted_bam.bam --chromlen chromlen.txt Jul-10 10:14:58.956 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 22.10.6 Jul-10 10:14:58.967 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/home/ceger/.nextflow/plugins; core-plugins: nf-amazon@1.11.3,nf-azure@0.14.2,nf-codecommit@0.1.2,nf-console@1.0.4,nf-ga4gh@1.0.4,nf-google@1.4.5,nf-tower@1.5.6,nf-wave@0.5.3 Jul-10 10:14:58.972 [main] INFO org.pf4j.DefaultPluginStatusProvider - Enabled plugins: [] Jul-10 10:14:58.973 [main] INFO org.pf4j.DefaultPluginStatusProvider - Disabled plugins: [] Jul-10 10:14:58.974 [main] INFO org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode Jul-10 10:14:58.979 [main] INFO org.pf4j.AbstractPluginManager - No plugins Jul-10 10:14:59.378 [main] DEBUG nextflow.config.ConfigBuilder - Found config base: /home/ceger/cellatac/nextflow.config Jul-10 10:14:59.379 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/ceger/cellatac/nextflow.config Jul-10 10:14:59.384 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard` Jul-10 10:14:59.431 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=1 by probing script field Jul-10 10:14:59.443 [main] INFO nextflow.cli.CmdRun - Launching `/home/ceger/cellatac/main.nf` [insane_lamarr] DSL1 - revision: 
0eb968d72b Jul-10 10:14:59.444 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[] Jul-10 10:14:59.444 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[] Jul-10 10:14:59.448 [main] DEBUG nextflow.secret.LocalSecretsProvider - Secrets store: /home/ceger/.nextflow/secrets/store.json Jul-10 10:14:59.450 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@4cbc2e3b] - activable => nextflow.secret.LocalSecretsProvider@4cbc2e3b Jul-10 10:14:59.506 [main] DEBUG nextflow.Session - Session UUID: 4f5b8ef8-6140-44ec-a5cd-306e767d74bf Jul-10 10:14:59.506 [main] DEBUG nextflow.Session - Run name: insane_lamarr Jul-10 10:14:59.506 [main] DEBUG nextflow.Session - Executor pool size: 64 Jul-10 10:14:59.528 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=192; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false Jul-10 10:14:59.550 [main] DEBUG nextflow.cli.CmdRun - Version: 22.10.6 build 5843 Created: 23-01-2023 23:20 UTC (24-01-2023 00:20 CEST) System: Linux 6.9.7-100.fc39.x86_64 Runtime: Groovy 3.0.13 on OpenJDK 64-Bit Server VM 17.0.11-internal+0-adhoc..src Encoding: UTF-8 (UTF-8) Process: 3992615@carlos [169.254.3.1] CPUs: 64 - Mem: 755.2 GB (356.5 GB) - Swap: 8 GB (6.9 GB) Jul-10 10:14:59.577 [main] DEBUG nextflow.Session - Work-dir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work [ext2/ext3] Jul-10 10:14:59.594 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[] Jul-10 10:14:59.600 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory Jul-10 10:14:59.618 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory Jul-10 10:14:59.624 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > 
poolSize: 65; maxThreads: 1000 Jul-10 10:15:00.194 [main] DEBUG nextflow.Session - Session start Jul-10 10:15:00.197 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow started -- trace file: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/reports/trace.txt Jul-10 10:15:00.904 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution Jul-10 10:15:01.039 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.039 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.042 [main] DEBUG nextflow.executor.Executor - [warm up] executor > local Jul-10 10:15:01.045 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=64; memory=755.2 GB; capacity=64; pollInterval=100ms; dumpInterval=5m Jul-10 10:15:01.091 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.091 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.097 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.097 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.107 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.107 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.125 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.126 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.136 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.136 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.140 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 
10:15:01.140 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.142 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.143 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.145 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.145 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.147 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.147 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.149 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.149 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.152 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.152 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.154 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.155 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.163 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.163 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.163 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.165 [Task submitter] INFO nextflow.Session - [cd/9be202] Submitted process > prepare_cr_single (cr-prep 400) Jul-10 10:15:01.166 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.166 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.169 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.169 
[main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.173 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.173 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.179 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.179 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.181 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.181 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.184 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null Jul-10 10:15:01.184 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' Jul-10 10:15:01.185 [main] DEBUG nextflow.Session - Workflow process names [dsl1]: prepare_cr_single, make_big_matrix, cusanovich_clustering, prepare_cr_mux, filter_big_matrix, clusters_index, make_subset_peakmatrix, join_muxfiles, peaks_make_masterlist, cells_masterlist_coverage, make_sample_matrix, clusters_macs2, sample_demux, clusters_merge_inputs, mmtx_big_matrix, join_sample_matrix, make_master_peakmatrix, seurat_clustering, prepare_mm, episcanpy_clustering Jul-10 10:15:01.185 [main] DEBUG nextflow.script.ScriptRunner - > Awaiting termination Jul-10 10:15:01.185 [main] DEBUG nextflow.Session - Session await Jul-10 10:15:01.811 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: prepare_cr_single (cr-prep 400); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/cd/9be2029d258f3299d563cdad309c3b] Jul-10 10:15:01.875 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.876 [Task 
submitter] INFO nextflow.Session - [40/2b1ab8] Submitted process > sample_demux (sample-crsingle batch-ah) Jul-10 10:15:01.882 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.882 [Task submitter] INFO nextflow.Session - [b2/e66411] Submitted process > sample_demux (sample-crsingle batch-ad) Jul-10 10:15:01.888 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.888 [Task submitter] INFO nextflow.Session - [58/8568c8] Submitted process > sample_demux (sample-crsingle batch-ag) Jul-10 10:15:01.893 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.894 [Task submitter] INFO nextflow.Session - [4c/bf7058] Submitted process > sample_demux (sample-crsingle batch-ai) Jul-10 10:15:01.898 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.898 [Task submitter] INFO nextflow.Session - [84/c76ba9] Submitted process > sample_demux (sample-crsingle batch-ae) Jul-10 10:15:01.902 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.903 [Task submitter] INFO nextflow.Session - [b2/082d6b] Submitted process > sample_demux (sample-crsingle batch-aa) Jul-10 10:15:01.907 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.907 [Task submitter] INFO nextflow.Session - [98/4385bb] Submitted process > sample_demux (sample-crsingle batch-ac) Jul-10 10:15:01.912 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:15:01.912 [Task submitter] INFO nextflow.Session - [96/492a6e] Submitted process > sample_demux (sample-crsingle batch-ab) Jul-10 10:15:01.916 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: 
/bin/bash -ue .command.run Jul-10 10:15:01.916 [Task submitter] INFO nextflow.Session - [c5/2695d1] Submitted process > sample_demux (sample-crsingle batch-af) Jul-10 10:19:08.704 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 10; name: sample_demux (sample-crsingle batch-ag); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/58/8568c85d75608ed66f0bf29f50471e] Jul-10 10:19:08.736 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 7; name: sample_demux (sample-crsingle batch-ad); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/b2/e66411919ea1779667814a92f3d9e0] Jul-10 10:19:09.038 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 9; name: sample_demux (sample-crsingle batch-af); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/c5/2695d1388ac7672b1b8434c32da86d] Jul-10 10:19:09.294 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 12; name: sample_demux (sample-crsingle batch-ai); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/4c/bf7058bcb3b17d40858ee43a268620] Jul-10 10:19:10.379 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 6; name: sample_demux (sample-crsingle batch-ac); status: COMPLETED; exit: 0; error: -; workDir: 
/mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/98/4385bbd5ed94924225d6e106e76e2f] Jul-10 10:19:11.080 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 4; name: sample_demux (sample-crsingle batch-aa); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/b2/082d6b1c0f19d1f2c2edca2ef714df] Jul-10 10:19:11.773 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 5; name: sample_demux (sample-crsingle batch-ab); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/96/492a6ea6826f3ac8c7d68996e7e002] Jul-10 10:19:12.210 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 8; name: sample_demux (sample-crsingle batch-ae); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/84/c76ba946338166d602047d1659c055] Jul-10 10:19:12.422 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 11; name: sample_demux (sample-crsingle batch-ah); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/40/2b1ab88f200fd999f78fa7f0f9ddaf] Jul-10 10:19:12.451 [Actor Thread 62] WARN nextflow.container.SingularityCache - Singularity cache directory has not been defined -- Remote image will be stored in the path: 
/mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location Jul-10 10:19:12.451 [Actor Thread 62] INFO nextflow.container.SingularityCache - Pulling Singularity image docker://quay.io/cellgeni/cellclusterer [cache /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/singularity/quay.io-cellgeni-cellclusterer.img] Jul-10 10:19:12.906 [Actor Thread 58] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 3463; slices: 1; internal sort time: 0.415 s; external sort time: 0.044 s; total time: 0.459 s Jul-10 10:19:12.944 [Actor Thread 58] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /tmp/83b72cf4c1fa6b92be4a5d236641ff6d.collect-file Jul-10 10:19:12.948 [Actor Thread 58] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /tmp/nxf-3340720557633142387 Jul-10 10:19:14.933 [Actor Thread 62] DEBUG nextflow.container.SingularityCache - Singularity pull complete image=docker://quay.io/cellgeni/cellclusterer path=/mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/singularity/quay.io-cellgeni-cellclusterer.img Jul-10 10:19:14.943 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:14.943 [Task submitter] INFO nextflow.Session - [07/f75f0a] Submitted process > make_big_matrix (1) Jul-10 10:19:14.945 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:14.945 [Task submitter] INFO nextflow.Session - [7e/1bbad8] Submitted process > make_sample_matrix (crsingle) Jul-10 10:19:18.146 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task 
completed > TaskHandler[id: 14; name: make_sample_matrix (crsingle); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/7e/1bbad88042d351ca6ec65a9ec8e28a] Jul-10 10:19:18.164 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:18.164 [Task submitter] INFO nextflow.Session - [e4/590db8] Submitted process > join_sample_matrix Jul-10 10:19:21.214 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 13; name: make_big_matrix (1); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/07/f75f0a281ceff35382d6298c201f9f] Jul-10 10:19:21.221 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:21.221 [Task submitter] INFO nextflow.Session - [31/ff9df1] Submitted process > mmtx_big_matrix (raw_window_bc_matrix) Jul-10 10:19:21.239 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:21.239 [Task submitter] INFO nextflow.Session - [da/d7b1c5] Submitted process > filter_big_matrix (bed-mcx-mmtx) Jul-10 10:19:21.282 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 15; name: join_sample_matrix; status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/e4/590db80fa2258cbcdd62aefd6e9288] Jul-10 10:19:21.312 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run Jul-10 10:19:21.313 [Task submitter] INFO nextflow.Session - [74/6e1212] Submitted process 
> seurat_clustering (seurat2020) Jul-10 10:19:24.658 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 17; name: filter_big_matrix (bed-mcx-mmtx); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/da/d7b1c5ca5d11d4f9c729ccdb2cf6d8] Jul-10 10:19:24.668 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 16; name: mmtx_big_matrix (raw_window_bc_matrix); status: COMPLETED; exit: 0; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/31/ff9df1612f45316458e185a1f04818] Jul-10 10:19:31.280 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 18; name: seurat_clustering (seurat2020); status: COMPLETED; exit: 1; error: -; workDir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/74/6e12124514ab4196e97fe41faf16f7] Jul-10 10:19:31.286 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'seurat_clustering (seurat2020)' Caused by: Process `seurat_clustering (seurat2020)` terminated with an error exit status (1) Command executed: R --no-save --args < /home/ceger/cellatac/bin/ca_seurat_clades.R Command exit status: 1 Command output: Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(Signac) > library(Seurat) > library('Matrix') > library(R.utils) > > theargs <- R.utils::commandArgs(asValues=TRUE) > have_multisample <- is.null(theargs$"single-sample") > do_clip <- is.null(theargs$noclip) # with contortion apologies. 
> > ### NOTE/check: make sure CR singlecell.csv table works as expected if we merge multiplets ourselves > > # cell.names filtered_cell.stats filtered_window_bc_matrix.mmtx.gz regions.names win.stats > > ### Load cellATAC windows binary matrix > f_binary_mat <- as(readMM(file = 'mmtx/filtered_window_bc_matrix.mmtx.gz'), "dgCMatrix") > regions.names = read.delim('mmtx/regions.names', header = FALSE, stringsAsFactors = FALSE) > cell.names = read.delim('mmtx/cell.names', header = FALSE, stringsAsFactors = FALSE) > colnames(f_binary_mat) = cell.names$V1 > rownames(f_binary_mat) = regions.names$V1 > > ### Load cell calling from cellranger - Seurat likes this > metadata <- read.csv(file = 'singlecell.csv', header = TRUE, row.names=1, sep="\t") > > ## Filter metadata to match cells in f_binary_mat > #metadata <- metadata[colnames(f_binary_mat), , drop=FALSE] > # > ## Ensure metadata only contains cells present in the count matrix > #common_cells <- intersect(colnames(f_binary_mat), rownames(metadata)) > #f_binary_mat <- f_binary_mat[, common_cells, drop=FALSE] > #metadata <- metadata[common_cells, , drop=FALSE] > # Check for zero rows and columns in the matrix > > ### Creating a Seurat object using the windows/cell matrix > so <- CreateSeuratObject( + counts = f_binary_mat, + assay = 'peaks', + project = 'ATAC', + min.cells = 1, + meta.data = metadata + ) > > ### Normalization - term frequency-inverse document frequency (TF-IDF) > # is a two-step normalization procedure, > # that both normalizes across cells to correct for differences in cellular sequencing depth, > # and across peaks to give higher values to more rare peaks > so <- RunTFIDF(so) Command error: warnings > > theargs <- R.utils::commandArgs(asValues=TRUE) > have_multisample <- is.null(theargs$"single-sample") > do_clip <- is.null(theargs$noclip) # with contortion apologies. 
> > ### NOTE/check: make sure CR singlecell.csv table works as expected if we merge multiplets ourselves > > # cell.names filtered_cell.stats filtered_window_bc_matrix.mmtx.gz regions.names win.stats > > ### Load cellATAC windows binary matrix > f_binary_mat <- as(readMM(file = 'mmtx/filtered_window_bc_matrix.mmtx.gz'), "dgCMatrix") > regions.names = read.delim('mmtx/regions.names', header = FALSE, stringsAsFactors = FALSE) > cell.names = read.delim('mmtx/cell.names', header = FALSE, stringsAsFactors = FALSE) > colnames(f_binary_mat) = cell.names$V1 > rownames(f_binary_mat) = regions.names$V1 > > ### Load cell calling from cellranger - Seurat likes this > metadata <- read.csv(file = 'singlecell.csv', header = TRUE, row.names=1, sep="\t") > > ## Filter metadata to match cells in f_binary_mat > #metadata <- metadata[colnames(f_binary_mat), , drop=FALSE] > # > ## Ensure metadata only contains cells present in the count matrix > #common_cells <- intersect(colnames(f_binary_mat), rownames(metadata)) > #f_binary_mat <- f_binary_mat[, common_cells, drop=FALSE] > #metadata <- metadata[common_cells, , drop=FALSE] > # Check for zero rows and columns in the matrix > > ### Creating a Seurat object using the windows/cell matrix > so <- CreateSeuratObject( + counts = f_binary_mat, + assay = 'peaks', + project = 'ATAC', + min.cells = 1, + meta.data = metadata + ) > > ### Normalization - term frequency-inverse document frequency (TF-IDF) > # is a two-step normalization procedure, > # that both normalizes across cells to correct for differences in cellular sequencing depth, > # and across peaks to give higher values to more rare peaks > so <- RunTFIDF(so) Performing TF-IDF normalization Error in checkSlotAssignment(object, name, value) : assignment of an object of class “dgeMatrix” is not valid for slot ‘data’ in an object of class “Assay”; is(value, "AnyMatrix") is not TRUE Calls: RunTFIDF ... 
SetAssayData.Assay -> slot<- -> checkSlotAssignment Execution halted INFO: Cleaning up image... Work dir: /mnt/LaCIE/ceger/Projects/human_heart_mapping/human_heart_mapping/0-raw_data_processing/3-240417-E-MTAB-12916_E-MTAB-12919/1-240419-CellRanger_ARC/work/74/6e12124514ab4196e97fe41faf16f7 Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run` Jul-10 10:19:31.289 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process `seurat_clustering (seurat2020)` terminated with an error exit status (1) Jul-10 10:19:31.298 [Task monitor] DEBUG nextflow.Session - The following nodes are still active: [process] peaks_make_masterlist status=ACTIVE port 0: (value) bound ; channel: np_files port 1: (value) bound ; channel: f_chromlen port 2: (cntrl) - ; channel: $ [process] cells_masterlist_coverage status=ACTIVE port 0: (value) OPEN ; channel: masterbed_sps port 1: (value) bound ; channel: f_chromlen port 2: (queue) OPEN ; channel: celldef_list port 3: (cntrl) - ; channel: $ [process] make_master_peakmatrix status=ACTIVE port 0: (queue) OPEN ; channel: metafile port 1: (value) OPEN ; channel: masterpeak.bed port 2: (value) bound ; channel: cells.tab port 3: (cntrl) - ; channel: $ Jul-10 10:19:31.299 [main] DEBUG nextflow.Session - Session await > all processes finished Jul-10 10:19:31.299 [main] DEBUG nextflow.Session - Session await > all barriers passed Jul-10 10:19:31.299 [Actor Thread 64] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null Jul-10 10:19:31.302 [main] DEBUG nextflow.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=15; failedCount=1; ignoredCount=0; cachedCount=0; pendingCount=1; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=37m 20s; failedDuration=9.9s; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=9; peakCpus=9; peakMemory=0; ] Jul-10 10:19:31.302 [main] DEBUG 
nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file Jul-10 10:19:31.303 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline Jul-10 10:19:31.497 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done Jul-10 10:19:31.503 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye ```
singlecell.csv (created based on the per_barcode_metrics.csv file made by cellranger-arc and containing the unfiltered list of barcodes):

```
barcode total duplicate chimeric unmapped lowmapq mitochondrial passed_filters cell_id is_cell_barcode DNase_sensitive_region_fragments enhancer_region_fragments promoter_region_fragments on_target_fragments blacklist_region_fragments excluded_reason TSS_fragments peak_region_fragments peak_region_cutsites
AAACAAGCAAACAAAG-1 10 3 0 2 0 0 0 0 0 0 0 0 0 0 20 0 0
AAACAAGCAAACATGT-1 12 0 0 1 7 0 0 0 0 0 0 0 0 0 03 2 4
AAACAAGCAAACCCAA-1 225 28 1 3 40 0 0 0 0 0 0 0 0 0 040 38 68
AAACAAGCAAACCTAG-1 3 1 0 0 0 0 0 0 0 0 0 0 0 0 01 1 2
AAACAAGCAAACGCGT-1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 00 0 0
...
```

chromlen.txt (created based on this comment in main.nf):

```nextflow
params.chromlen = null // File with names of chromosomes.
```

```
chr1
chr2
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr20
chr21
chr22
chrX
chrY
chrM
```

Mamba environment:

```
name: nextflow-env
channels:
  - bioconda
  - conda-forge
dependencies:
  - _libgcc_mutex=0.1
  - _openmp_mutex=4.5
  - alsa-lib=1.2.11
  - bzip2=1.0.8
  - c-ares=1.28.1
  - ca-certificates=2024.6.2
  - cairo=1.18.0
  - cni=1.0.1
  - cni-plugins=1.3.0
  - coreutils=8.25
  - curl=7.88.1
  - expat=2.6.2
  - font-ttf-dejavu-sans-mono=2.37
  - font-ttf-inconsolata=3.000
  - font-ttf-source-code-pro=2.038
  - font-ttf-ubuntu=0.83
  - fontconfig=2.14.2
  - fonts-conda-ecosystem=1
  - fonts-conda-forge=1
  - freetype=2.12.1
  - giflib=5.2.2
  - graphite2=1.3.13
  - harfbuzz=8.5.0
  - icu=73.2
  - jq=1.7.1
  - keyutils=1.6.1
  - krb5=1.20.1
  - lcms2=2.16
  - lerc=4.0.0
  - libarchive=3.5.2
  - libcups=2.3.3
  - libcurl=7.88.1
  - libdeflate=1.20
  - libedit=3.1.20191231
  - libev=4.33
  - libexpat=2.6.2
  - libffi=3.4.2
  - libgcc=7.2.0
  - libgcc-ng=14.1.0
  - libglib=2.80.2
  - libgomp=14.1.0
  - libiconv=1.17
  - libjpeg-turbo=3.0.0
  - libnghttp2=1.58.0
  - libpng=1.6.43
  - libseccomp=2.4.4
  - libssh2=1.11.0
  - libstdcxx-ng=14.1.0
  - libtiff=4.6.0
  - libuuid=2.38.1
  - libwebp-base=1.4.0
  - libxcb=1.16
  - libxml2=2.12.7
  - libzlib=1.3.1
  - lz4-c=1.9.4
  - lzo=2.10
  - ncurses=6.5
  - nextflow=22.10.6
  - oniguruma=6.9.9
  - openjdk=17.0.11
  - openssl=3.3.1
  - pcre2=10.44
  - pixman=0.43.2
  - pthread-stubs=0.4
  - singularity=3.8.6
  - squashfs-tools=4.6.1
  - xorg-fixesproto=5.0
  - xorg-inputproto=2.3.2
  - xorg-kbproto=1.0.7
  - xorg-libice=1.1.1
  - xorg-libsm=1.2.4
  - xorg-libx11=1.8.9
  - xorg-libxau=1.0.11
  - xorg-libxdmcp=1.1.3
  - xorg-libxext=1.3.4
  - xorg-libxfixes=5.0.3
  - xorg-libxi=1.7.10
  - xorg-libxrender=0.9.11
  - xorg-libxt=1.3.0
  - xorg-libxtst=1.2.3
  - xorg-recordproto=1.14.2
  - xorg-renderproto=0.11.1
  - xorg-xextproto=7.3.0
  - xorg-xproto=7.0.31
  - xz=5.2.6
  - zlib=1.3.1
  - zstd=1.5.6
prefix: /home/ceger/miniforge3/envs/nextflow-env
```

Thank you for your time!

micans commented 2 weeks ago

Thank you for the excellent report. It has been quite a few years since I last touched this code, and I am no longer able to run it myself. However, one thing that might help is that chromlen.txt needs to have two columns (separated by <TAB>), the first being the chromosome name, the second the length of the chromosome. Although the parameter name is suggestive, the comment is misleading and the documentation is lacking; both could be improved. I don't know whether this explains the error you get, but it is worth a try.
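A minimal sketch of a format check for the two-column layout described above. The helper name check_chromlen is hypothetical and not part of cellatac; it only verifies that every line is a name, a TAB, and a whole-number length.

```bash
# Hypothetical helper (not part of cellatac): succeed only if every line of the
# file has exactly two TAB-separated fields and the second one is a whole number.
check_chromlen() {
    awk -F'\t' '
        NF != 2 || $2 !~ /^[0-9]+$/ {
            printf "line %d is not <name><TAB><length>: %s\n", NR, $0 > "/dev/stderr"
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demo on throwaway files: one in the two-column layout, one with names only.
tmp=$(mktemp -d)
printf 'chr1\t248956422\nchr2\t242193529\n' > "$tmp/two_col.txt"
printf 'chr1\nchr2\n' > "$tmp/one_col.txt"
check_chromlen "$tmp/two_col.txt" && echo "two_col.txt: ok"
check_chromlen "$tmp/one_col.txt" 2>/dev/null || echo "one_col.txt: rejected"
```

If an indexed FASTA for the reference is at hand, the first two columns of its .fai index already have this layout, so something like cut -f1,2 genome.fa.fai > chromlen.txt would produce the file (the genome.fa.fai path is an assumption about the local setup).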

egerc commented 2 weeks ago

Thank you so much for the quick response! I've added the chromosome lengths of the assembly used from here, but the same problem persists.

chromlen.txt

chr1    248956422
chr2    242193529
chr3    198295559
chr4    190214555
chr5    181538259
chr6    170805979
chr7    159345973
chr8    145138636
chr9    138394717
chr10   133797422
chr11   135086622
chr12   133275309
chr13   114364328
chr14   107043718
chr15   101991189
chr16   90338345
chr17   83257441
chr18   80373285
chr19   58617616
chr20   64444167
chr21   46709983
chr22   50818468
chrX    156040895
chrY    57227415

micans commented 2 weeks ago

By searching for the error message I landed on https://github.com/stuart-lab/signac/issues/111. Could you check whether there are any relevant suggestions there? I haven't updated or looked at this code in years, and the R part was incorporated by someone else - my apologies for not being more useful.

I do have a memory, though, of converting the singlecell.csv file from what was potentially the cellranger-arc standard to the column names used by cellranger-atac (if the latter is what it was called). The column names did not match, so I changed the newer column names to the older ones. If you have access to both types of files, it's something you could check (I certainly cannot be sure this is the issue). Unfortunately I no longer have access to those scripts (change of workplace).
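The header comparison suggested above could be sketched as follows. The function names are hypothetical, and the two headers in the demo are toy examples for illustration, not the full 10x column lists.

```bash
# Print the column names from the first line of a CSV file, one per line, sorted.
csv_columns() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n' | sort
}

# Hypothetical helper: list columns present in the first CSV but absent from the second.
columns_only_in_first() {
    other=$(mktemp)
    csv_columns "$2" > "$other"
    # -x: whole-line match, -F: fixed strings, -v: keep names NOT found in $other
    csv_columns "$1" | grep -vxF -f "$other"
    rm -f "$other"
}

# Demo with toy headers (assumed for illustration only).
tmp=$(mktemp -d)
printf 'barcode,is__cell_barcode,passed_filters\nAAAC-1,1,100\n' > "$tmp/atac_style.csv"
printf 'barcode,is_cell,atac_fragments\nAAAC-1,1,100\n' > "$tmp/arc_style.csv"
columns_only_in_first "$tmp/atac_style.csv" "$tmp/arc_style.csv"
```

Running it in both directions shows which names the pipeline would look for but not find, and which names only exist in the newer format.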

micans commented 2 weeks ago

The aforementioned rewrite is in fact part of the repository as bin/rewrite-metrics-file.R. This rewrites a multiome format ("10x multiomics per_barcode_metrics.csv") into the format that cellatac expects.

egerc commented 1 week ago

> By searching for the error message I landed on stuart-lab/signac#111. Could you check whether there are any relevant suggestions there? I haven't updated or looked at this code in years, and the R part was incorporated by someone else - my apologies for not being more useful.

The change that fixed this issue was apparently implemented in version 0.2.5 of Signac, while the version in the Docker image used here is 0.2.0. I tried replacing the image with a conda environment. So far I haven't been able to make that work, but once I do, I'm hopeful that'll fix it for good!
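A small sketch for checking whether an installed Signac is at least the version mentioned above. The helper version_at_least is hypothetical and relies on sort -V (available in GNU coreutils); how to get a shell inside the pipeline's container depends on the local setup and is left as a comment.

```bash
# Hypothetical helper: true if version $1 is at least version $2.
# Works by checking that $2 sorts first (or equal) in version order.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Inside the container or conda environment that runs the R step, the installed
# version could be read with (assumes Rscript is on the PATH there):
#   installed=$(Rscript -e 'cat(as.character(packageVersion("Signac")))')
installed=0.2.0   # value reported for the current image in this thread

if version_at_least "$installed" 0.2.5; then
    echo "Signac $installed is at least 0.2.5"
else
    echo "Signac $installed is older than 0.2.5"
fi
```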

> The aforementioned rewrite is in fact part of the repository as bin/rewrite-metrics-file.R. This rewrites a multiome format ("10x multiomics per_barcode_metrics.csv") into the format that cellatac expects.

Thanks for pointing that out to me. I missed that and wrote my own script for it, but maybe I did it wrong.

Thank you for your help!

micans commented 1 week ago

I'm a bit rusty with all this. Ideally we'd update the Docker recipe on quay.io (and update Signac), but it would be good to first know for sure that this is what solves the issue. I'm a bit doubtful. As a test, you could run cellatac on regular 10x ATAC data; if that works, it suggests that the issue has to do with the data/metadata format rather than the Signac version. If you can find the Nextflow work directory where the failure happens, you could look at the files there to see if anything looks suspicious. You can do this e.g. by adding -with-trace reports/trace.txt to the nextflow arguments; the trace file will tell you the prefix of the work directory where each task is executed as well as the task exit status. If you pursue this and get stuck, let me know.
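A minimal sketch for pulling the failed tasks out of such a trace file, assuming the default tab-separated layout with a header row that includes the hash, name, status and exit columns. The helper name failed_tasks and the task names in the demo are hypothetical.

```bash
# Hypothetical helper: print work-directory prefix, exit status and task name
# for every task whose status is FAILED. Columns are located by header name,
# so the order of fields in the trace file does not matter.
failed_tasks() {
    awk -F'\t' '
        BEGIN { OFS = "\t" }
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $(col["status"]) == "FAILED" { print $(col["hash"]), $(col["exit"]), $(col["name"]) }
    ' "$1"
}

# Demo on a toy trace file (task names are made up for illustration).
tmp=$(mktemp -d)
printf 'task_id\thash\tnative_id\tname\tstatus\texit\n'   > "$tmp/trace.txt"
printf '1\t3f/9a1c2e\t101\tprepare\tCOMPLETED\t0\n'      >> "$tmp/trace.txt"
printf '2\tab/cdef12\t102\tclusters\tFAILED\t1\n'        >> "$tmp/trace.txt"
failed_tasks "$tmp/trace.txt"
```

With the run script from this thread (which passes -w work), a reported prefix such as ab/cdef12 would correspond to a directory matching work/ab/cdef12*, where the task's .command.sh, .command.log and .command.err files can be inspected.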