COMBINE-lab / minnow


Simulated reads seem to not be compatible with Cell Ranger 6 #22

Open hoyu310 opened 3 years ago

hoyu310 commented 3 years ago

Hi, I tried running Cell Ranger 6 (latest version) on the simulated reads, but within two minutes, before alignment even began, it failed with the following error message (the same run using Cell Ranger 3 completed successfully):

[error] Pipestance failed. Error log at: simulate_rl151/SC_RNA_COUNTER_CS/SC_MULTI_CORE/MULTI_GEM_WELL_PROCESSOR/COUNT_GEM_WELL_PROCESSOR/_BASIC_SC_RNA_COUNTER/_MATRIX_COMPUTER/MAKE_SHARD/fork0/chnk0-u31d28af566/_errors

Log message: stage failed unexpectedly: 'Invalid quality value 45 ASCII character 78 at position 0' execroot/home/git/checkouts/fastq_set-9504d5db1bfdb3fb/7cf23e7/src/squality.rs:19:
0: martian::martian_entry_point::{{closure}}
     at execroot/exec/lib/rust/execroot/home/git/checkouts/martian-rust-35615836cc90309f/8b43658/martian/src/lib.rs:184:25
1: std::panicking::rust_panic_with_hook
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:597:17
2: std::panicking::begin_panic_handler::{{closure}}
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:499:13
3: std::sys_common::backtrace::__rust_end_short_backtrace
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:141:18
4: rust_begin_unwind
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:495:5
5: std::panicking::begin_panic_fmt
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:437:5
6: ::validate_bytes
     at execroot/exec/lib/rust/execroot/home/git/checkouts/fastq_set-9504d5db1bfdb3fb/7cf23e7/execroot/home/git/checkouts/fastq_set-9504d5db1bfdb3fb/7cf23e7/src/squality.rs:19:17
   fastq_set::array::ByteArray<N,T>::from_iter
     at execroot/exec/lib/rust/execroot/home/git/checkouts/fastq_set-9504d5db1bfdb3fb/7cf23e7/src/array.rs:71:9
   fastq_set::array::ByteArray<N,T>::new
     at execroot/exec/lib/rust/execroot/home/git/checkouts/fastq_set-9504d5db1bfdb3fb/7cf23e7/src/array.rs:44:9
   ::raw_bc_construct_qual::{{closure}}
     at execroot/exec/lib/rust/cr_types/src/rna_read.rs:440:22
   cr_types::barcode::BarcodeConstruct::map::{{closure}}
     at execroot/exec/lib/rust/cr_types/src/barcode.rs:336:32
7: cr_types::barcode::BarcodeConstruct::map_result
     at execroot/exec/lib/rust/cr_types/src/barcode.rs:320:43
   cr_types::barcode::BarcodeConstruct::map
     at execroot/exec/lib/rust/cr_types/src/barcode.rs:336:9
   ::raw_bc_construct_qual
     at execroot/exec/lib/rust/cr_types/src/rna_read.rs:439:9
8: ::visit_processed_read
     at execroot/exec/lib/rust/cr_lib/src/make_shard_metrics.rs:261:47
   cr_lib::barcode_sort::BarcodeSortWorkflow::execute_workflow_with_visitor
     at execroot/exec/lib/rust/cr_lib/src/barcode_sort.rs:106:21
   ::main
     at execroot/exec/lib/rust/cr_lib/src/make_shard.rs:241:9
9: ::main
     at execroot/exec/lib/rust/execroot/home/git/checkouts/martian-rust-35615836cc90309f/8b43658/martian/src/stage.rs:642:20
10: martian::martian_entry_point
     at execroot/exec/lib/rust/execroot/home/git/checkouts/martian-rust-35615836cc90309f/8b43658/martian/src/lib.rs:225:9
   martian::MartianAdapter::run_get_error
     at execroot/exec/lib/rust/execroot/home/git/checkouts/martian-rust-35615836cc90309f/8b43658/martian/src/lib.rs:131:9
   martian::MartianAdapter::run
     at execroot/exec/lib/rust/execroot/home/git/checkouts/martian-rust-35615836cc90309f/8b43658/martian/src/lib.rs:124:9
   cr_lib::main
     at execroot/exec/lib/rust/cr_lib/src/bin/cr_lib.rs:111:23
11: core::ops::function::FnOnce::call_once
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:227:5
   std::sys_common::backtrace::__rust_begin_short_backtrace
     at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:125:18
12: main
13: __libc_start_main
14: at /mnt/home/adam.azarchs/crosstool/build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103

hiraksarkar commented 2 years ago

Sorry for getting back to you late. I don't think we support Cell Ranger 6; it may have to do with the cell barcode (CB) length in the FASTQ format. That could be the issue.

hoyu310 commented 2 years ago

No problem, thanks for the reply.

ColeWunderlich commented 1 year ago

I think this is related to the sequence quality values that minnow is generating: the quality string is all Ns, and N is 78 in ASCII.

Here's an example from a simulated R1.fastq:

@AAAGAACAGACATGCGACGATGCTTTAA:GeneName:148:0:0
AAAGAACAGACATGCGACGATGCTTTAA
+
NNNNNNNNNNNNNNNNNNNNNNNNNNNN
@AAAGAACAGACATGCGGGTCCCGGGATG:GeneName:5:1:1
AAAGAACAGACATGCGGGTCCCGGGATG
+
NNNNNNNNNNNNNNNNNNNNNNNNNNNN

With the Phred+33 encoding, N corresponds to a quality value of 45 (78 - 33), which is out of range for current Illumina sequencers (Illumina 1.8+, which tops out at J = 41). See here: https://en.m.wikipedia.org/wiki/FASTQ_format?useskin=vector#Encoding
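To make the arithmetic concrete, here is a small Python sketch that decodes a few quality characters under the Phred+33 assumption. The function name and the 41 cap are illustrative choices based on the range cited above, not anything taken from Cell Ranger's or minnow's code.

```python
PHRED_OFFSET = 33        # Phred+33 (Sanger / Illumina 1.8+) encoding
ILLUMINA_18_MAX_Q = 41   # 'J'; upper bound cited in this thread (assumption)


def phred33_score(ch: str) -> int:
    """Return the quality score encoded by a single FASTQ quality character."""
    return ord(ch) - PHRED_OFFSET


# Decode the character minnow writes ('N') and some in-range candidates.
for ch in "NJIH":
    q = phred33_score(ch)
    status = "ok" if 0 <= q <= ILLUMINA_18_MAX_Q else "out of range"
    print(f"{ch!r}: ASCII {ord(ch)}, Q{q} ({status})")
```

The 'N' line reports ASCII 78 and Q45, the same pair of numbers that appears in the Cell Ranger error.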

You can see that CR is complaining about quality scores from your error:

stage failed unexpectedly: 'Invalid quality value 45 ASCII character 78 at position 0'

I think the solution would be to set them to something within the range of Illumina 1.8+ (say I, which is Q40, or maybe H, which is Q39, to be safe).
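Until that is changed in minnow itself, one possible workaround is to post-process the simulated FASTQ and cap the quality characters. The following is a minimal sketch, not minnow's code: `cap_qualities` is a hypothetical helper, it assumes strict 4-line FASTQ records like the example above, and it only addresses the quality values (it does nothing about the barcode-length concern raised earlier in the thread).

```python
from typing import Iterable, Iterator


def cap_qualities(lines: Iterable[str], max_char: str = "I") -> Iterator[str]:
    """Yield FASTQ lines with quality characters above max_char replaced by it.

    Assumes 4-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        if i % 4 == 3:  # fourth line of each record is the quality string
            body = line.rstrip("\n")
            newline = line[len(body):]  # preserve the original line ending
            yield "".join(min(c, max_char) for c in body) + newline
        else:
            yield line


# In-memory demo using the first record quoted above.
record = [
    "@AAAGAACAGACATGCGACGATGCTTTAA:GeneName:148:0:0\n",
    "AAAGAACAGACATGCGACGATGCTTTAA\n",
    "+\n",
    "NNNNNNNNNNNNNNNNNNNNNNNNNNNN\n",
]
fixed = list(cap_qualities(record))
print("".join(fixed), end="")
```

For a real file, the same generator can be fed an open file handle and the result written to a new file with `writelines`; for gzipped FASTQ, `gzip.open(path, "rt")` yields text lines in the same way.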