Closed ddiez closed 1 year ago
Hi @ddiez
Sorry, the workflow is giving you an incorrect error message.
I think the issue is that the sample_ids do not match between the single_cell_sample_sheet and those determined from the input data.
Passing --fastq wf-single-cell-demo/fastq/A/chr17a.fq.gz means the input from the A folder will have a sample_id of chr17.
But if you pass --fastq wf-single-cell-demo/fastq/A, that should set the sample_id to 'A', the same as in your sample sheet.
You might want to put your sample data in subdirectories, each named with its sample_id:
fastq
├── A
│   └── reads.fq
└── B
    └── reads.fq
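A quick way to check for this kind of mismatch before launching a run is to compare the IDs in the sample sheet with the subdirectory names. The following is only a sketch and not part of the workflow: it assumes the sample sheet is a CSV file with a header row containing a `sample_id` column, and that each sample lives in its own subdirectory of the fastq folder. The function names are made up for illustration.

```python
import csv
from pathlib import Path


def read_sheet_ids(sheet_path, column="sample_id"):
    """Return the set of sample IDs listed in a CSV sample sheet.

    Assumption: the sheet has a header row that contains `column`.
    """
    with open(sheet_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise ValueError(f"no '{column}' column in {sheet_path}")
        # Skip rows where the ID cell is missing or empty.
        return {row[column].strip() for row in reader if row[column]}


def dir_sample_ids(fastq_dir):
    """Return the names of the immediate subdirectories of fastq_dir."""
    return {p.name for p in Path(fastq_dir).iterdir() if p.is_dir()}


def compare_ids(sheet_ids, data_ids):
    """Return (only_in_sheet, only_in_data) as two sorted lists."""
    return sorted(sheet_ids - data_ids), sorted(data_ids - sheet_ids)
```

Calling `compare_ids(read_sheet_ids("samples.txt"), dir_sample_ids("wf-single-cell-demo/fastq"))` returns two lists; if both are empty, the sheet and the directory layout agree on the sample IDs.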
Thanks @nrhorner for the quick reply and sorry for not providing complete information. I had indeed placed the files in folders following your suggested structure:
$ ls wf-single-cell-demo/fastq/*
wf-single-cell-demo/fastq/A:
chr17a.fq.gz
wf-single-cell-demo/fastq/B:
chr17b.fq.gz
I also just tried, without success, several variations of the --fastq argument:
# Original
--fastq wf-single-cell-demo/fastq/A/chr17a.fq.gz wf-single-cell-demo/fastq/B/chr17b.fq.gz
# Using same name for files as in your example (and renaming the files)
--fastq wf-single-cell-demo/fastq/A/chr17.fq.gz wf-single-cell-demo/fastq/B/chr17.fq.gz
# Passing just the folder with the subfolders
--fastq wf-single-cell-demo/fastq
# Passing the subfolders
--fastq wf-single-cell-demo/fastq/A wf-single-cell-demo/fastq/B
All these lead to the same error about the unsupported kit.
Hi @ddiez
I'll get a fix out ASAP. In the meantime, could you try running one sample at a time, please, passing the sample parameters on the command line with --kit_name, --kit_version and --expected_cells? Sorry for the inconvenience.
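For anyone who wants to script this workaround, here is a minimal sketch that builds one command per sample directory. It only uses flags already mentioned in this thread (--fastq, --kit_name, --kit_version, --expected_cells, --out_dir). The kit and cell-count values are placeholders rather than real settings, and the `build_command` helper and the `out_<sample>` output naming are invented for illustration.

```python
import shlex


def build_command(fastq_dir, sample_id, kit_name, kit_version, expected_cells,
                  extra_args=()):
    """Build one `nextflow run` command (as an argument list) for one sample."""
    cmd = [
        "nextflow", "run", "epi2me-labs/wf-single-cell",
        "--fastq", f"{fastq_dir}/{sample_id}",
        "--kit_name", str(kit_name),
        "--kit_version", str(kit_version),
        "--expected_cells", str(expected_cells),
        "--out_dir", f"out_{sample_id}",  # hypothetical output naming
    ]
    cmd.extend(extra_args)  # e.g. ["--ref_genome_dir", "/path/to/ref"]
    return cmd


# Placeholder per-sample settings; replace with the values from your sheet.
samples = {
    "A": ("KIT_NAME", "KIT_VERSION", 1000),
    "B": ("KIT_NAME", "KIT_VERSION", 1000),
}

for sample_id, (kit_name, kit_version, cells) in samples.items():
    cmd = build_command("wf-single-cell-demo/fastq", sample_id,
                        kit_name, kit_version, cells)
    # Print the command so it can be inspected before running it.
    print(shlex.join(cmd))
```

Printing the commands first keeps this a dry run; once they look right, each list can be passed to `subprocess.run(cmd, check=True)`.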
Thanks @nrhorner! FYI, I had already run an individual sample in the way you suggest, and although I had to fix a problem with the amount of memory available to the container, everything went fine. So there is always that option.
Hi @ddiez, it's good that you can at least run a single sample. I have a fix for the sample sheet issue, and it will be released shortly.
Thanks for the update!
Hi @ddiez, I just wanted to let you know that we haven't forgotten about this. I'm just waiting on one more thing before I can release the changes.
Hi @ddiez Sorry that this took so long, but there is a fix on our prerelease branch that should hopefully solve your sample sheet issues. It would be great if you're able to test it out.
nextflow run epi2me-labs/wf-single-cell -r prerelease ...
@nrhorner Thanks for the heads up. I checked with the example dataset set up as described above and it works. I will try with a real dataset soon although I imagine there won't be any problems. Thanks!
Thanks for getting back to me @ddiez. These changes will be released today in v0.3.0. I'll close this ticket now, but please let me know if you encounter any more issues.
Operating System
Other Linux (please specify below)
Other Linux
Ubuntu 23.04
Workflow Version
v0.2.7-g9272e2c
Workflow Execution
Command line
EPI2ME Version
No response
CLI command run
nextflow run epi2me-labs/wf-single-cell \
    -w single-cell-demo-out2/workspace \
    -profile standard \
    --fastq wf-single-cell-demo/fastq/A/chr17a.fq.gz wf-single-cell-demo/fastq/B/chr17b.fq.gz \
    --single_cell_sample_sheet samples.txt \
    --ref_genome_dir ~/10x/refdata-gex/refdata-gex-GRCh38-2020-A \
    --out_dir single-cell-demo-out2 \
    --plot_umaps \
    --umap_n_repeats 1
Workflow Execution - CLI Execution Profile
standard (default)
What happened?
I am trying to use the single_cell_sample_sheet option to pass the information for three different samples. To test this, I am using the demo dataset and pretending I have two samples. I provide the sample information in a samples.txt file that contains the following information:
When I run the workflow using the command pasted above, I get the following error:
The full log is included below. It seems the single_cell_sample_sheet file is correctly detected. For some reason the pipeline fails when checking the kit is one of the supported kits. Looking at the code in main.nf it is not clear to me why this might be. So I am wondering if I am using this option correctly or there is a problem in the workflow.
Relevant log output
Application activity log entry
No response