statgen / demuxlet

Genetic multiplexing of barcoded single cell RNA-seq
Apache License 2.0

Cannot find Droplet/Cell tag CB/UB warning messages #44

Closed asenabouth closed 5 years ago

asenabouth commented 5 years ago

Hello,

We have been noticing the following warning messages during the analysis of data generated by Cell Ranger 3.

The following parameters are available. Ones with "[]" are in effect:
   Options for input SAM/BAM/CRAM : --sam [/share/ScratchGeneral/annsen/data/experimental_data/CLEAN/POAG_scRNA/POAG_scRNA_V2/181108_MD_003/181108_MD_003/outs/possorted_genome_bam.bam],
                                    --tag-group [CB], --tag-UMI [UB]
        Options for input VCF/BCF : --vcf [/share/ScratchGeneral/annsen/data/experimental_data/CLEAN/POAG_scRNA/POAG_scRNA.vcf],
                                    --field [GP], --geno-error [0.01],
                                    --min-mac [1], --min-callrate [0.50], --sm,
                                    --sm-list [/share/ScratchGeneral/annsen/repositories/POAG_scRNA/Pools/181108_MD_003.tsv]
                   Output Options : --out [/share/ScratchGeneral/annsen/analysis/POAG_scRNA/POAG_scRNA_Demuxlet/181108_MD_003],
                                    --alpha [0.00, 0.50], --write-pair,
                                    --doublet-prior [0.50],
                                    --sam-verbose [1000000],
                                    --vcf-verbose [10000]
           Read filtering Options : --cap-BQ [40], --min-BQ [13],
                                    --min-MQ [20], --min-TD,
                                    --excl-flag [3844]
   Cell/droplet filtering options : --group-list [/share/ScratchGeneral/annsen/data/experimental_data/CLEAN/POAG_scRNA/POAG_scRNA_V2/181108_MD_003/181108_MD_003/outs/filtered_feature_bc_matrix/barcodes.tsv],
                                    --min-total, --min-uniq, --min-snp

Run with --help for more detailed help messages of each argument.

NOTICE [2019/05/24 17:19:52] - Finished loading 12569 droplet/cell barcodes to consider
NOTICE [2019/05/24 17:19:52] - Finished loading 7 IDs from /share/ScratchGeneral/annsen/repositories/POAG_scRNA/Pools/181108_MD_003.tsv
NOTICE [2019/05/24 17:19:52] - Finished identifying 7 samples to load from VCF/BCF
NOTICE [2019/05/24 17:19:55] - Reading 0 reads at 1:1 and skipping 0
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 12274-th read A00152:78:HHNG5DSXX:4:2103:4155:7310 at 1:17475-17573. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 26628-th read A00152:78:HHNG5DSXX:4:1669:22733:7764 at 1:128604-128702. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 26651-th read A00152:78:HHNG5DSXX:1:2140:16242:34976 at 1:128667-128765. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 26656-th read A00152:78:HHNG5DSXX:2:1412:32208:27398 at 1:128679-128777. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 26996-th read A00152:78:HHNG5DSXX:2:2268:17779:12868 at 1:134158-134256. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 26997-th read A00152:78:HHNG5DSXX:2:2526:30228:26318 at 1:134158-134256. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 29536-th read A00152:78:HHNG5DSXX:4:2647:3405:22701 at 1:135153-135251. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 29577-th read A00152:78:HHNG5DSXX:4:2253:7473:12117 at 1:138510-138608. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 35070-th read A00152:78:HHNG5DSXX:2:1137:1271:5259 at 1:157470-692645. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Cannot find Droplet/Cell tag CB from 35117-th read A00152:78:HHNG5DSXX:1:2460:9408:17973 at 1:157515-692690. Treating all of them as a single group
NOTICE [2019/05/24 17:19:55] - WARNING: Suppressing 10+ missing Droplet/Cell tag warnings...

This is the first time we have seen these warnings; we did not encounter them with data generated by Cell Ranger 2. They are of great concern to us because the results contain fewer cells.

This version of demuxlet was installed via anaconda. The conda environment is as follows:

# packages in environment at /share/ClusterShare/software/contrib/annsen/anaconda3/envs/demuxlet:
#
# Name                    Version                   Build  Channel
bzip2                     1.0.6             h14c3975_1002    conda-forge
ca-certificates           2019.3.9             hecc5488_0    conda-forge
curl                      7.64.0               h646f8bb_0    conda-forge
demuxlet                  1.0                  h7279bd8_1    bioconda
htslib                    1.9                  ha228f0b_7    bioconda
krb5                      1.16.3            hc83ff2d_1000    conda-forge
libcurl                   7.64.0               h01ee5af_0    conda-forge
libdeflate                1.0                  h14c3975_1    bioconda
libedit                   3.1.20170329      hf8c457e_1001    conda-forge
libgcc-ng                 8.2.0                hdf63c60_1    anaconda
libssh2                   1.8.0             h1ad7b7a_1003    conda-forge
libstdcxx-ng              8.2.0                hdf63c60_1    anaconda
libtool                   2.4.6             h14c3975_1002    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
openssl                   1.0.2r               h14c3975_0    conda-forge
samtools                  1.9                  h43f6869_9    bioconda
tk                        8.6.9             h84994c4_1001    conda-forge
xz                        5.2.4             h14c3975_1001    conda-forge
zlib                      1.2.11            h14c3975_1004    conda-forge

hyunminkang commented 5 years ago

This is not necessarily a concern. Cell Ranger sometimes does not assign a cell barcode to a read when it is not certain of the barcode.
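To gauge how many reads are affected, one option is to count alignments with and without a `CB` tag. The following is a minimal sketch, not part of demuxlet: it assumes the BAM has been converted to SAM text (for example with `samtools view`), and the function name `count_cb_tags` and the example records are made up for illustration.

```python
from collections import Counter


def count_cb_tags(sam_lines, tag="CB"):
    """Count SAM alignment records that carry, or lack, a string tag such as CB."""
    counts = Counter()
    prefix = tag + ":Z:"
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional tags start at the 12th column of a SAM record.
        has_tag = any(field.startswith(prefix) for field in fields[11:])
        counts["tagged" if has_tag else "untagged"] += 1
    return counts


# Hypothetical records: r1 has a corrected barcode (CB), r2 only a raw one (CR).
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\t1\t17475\t255\t98M\t*\t0\t0\t*\t*\tCB:Z:AAACCTGAGAAACCAT-1\tUB:Z:ACGTACGTAC",
    "r2\t0\t1\t128604\t255\t98M\t*\t0\t0\t*\t*\tCR:Z:AAACCTGAGAAACCAT",
]
summary = count_cb_tags(example)
print(summary["tagged"], summary["untagged"])
```

On real data the same function could be fed `sys.stdin` from a `samtools view` pipe. A small untagged fraction would be consistent with the explanation above; a large one may be worth a closer look.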

Hyun.

Hyun Min Kang, Ph.D. Associate Professor of Biostatistics University of Michigan, Ann Arbor Email : hmkang@umich.edu

asenabouth commented 5 years ago

Hi Hyun,

That makes sense. We are still losing cells in some experiments but not in others. I will reopen this if needed.
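A first check for an experiment that loses cells could be to list which barcodes from the `--group-list` file never appear as a `CB` tag in the alignments. This is only a rough sketch under assumptions: the input is SAM text rather than BAM, the helper name `barcodes_without_reads` is hypothetical, and the barcodes below are invented. It does not reproduce demuxlet's own filtering.

```python
def barcodes_without_reads(expected_barcodes, sam_lines, tag="CB"):
    """Return expected barcodes that never occur as the given tag in SAM records."""
    prefix = tag + ":Z:"
    seen = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        # Optional tags start at the 12th column; take the first matching tag.
        for field in line.rstrip("\n").split("\t")[11:]:
            if field.startswith(prefix):
                seen.add(field[len(prefix):])
                break
    return sorted(set(expected_barcodes) - seen)


# Hypothetical barcode list (as in barcodes.tsv) and alignment records.
expected = ["AAACCTGAGAAACCAT-1", "TTTGGTTCATTGCGGC-1"]
records = [
    "r1\t0\t1\t17475\t255\t98M\t*\t0\t0\t*\t*\tCB:Z:AAACCTGAGAAACCAT-1\tUB:Z:ACGTACGTAC",
    "r2\t0\t1\t128604\t255\t98M\t*\t0\t0\t*\t*\tCR:Z:TTTGGTTCATTGCGGC",
]
missing = barcodes_without_reads(expected, records)
print(missing)
```

A barcode on this list has no tagged reads at all, so it cannot be assigned. Barcodes that do have tagged reads but are still dropped would need a different explanation, such as too few reads overlapping variants in the VCF.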

Thanks, Anne