Closed aseyedia closed 2 months ago
Hi @aseyedia ,
Thanks for the issue - could you head your allele dataframe to see what it looks like?
Sure.
(base) [seyediana@reslnvvhpc041 04b-NumBat]$ zcat MCD1/MCD1_allele_counts.tsv.gz | head -n 50
cell snp_id CHROM POS cM REF ALT AD DP GT gene
TACCGAATCCGCTTAC-1 1_821000_T_A 1 821000 0.48995214890403 T A 0 1 0|1
TCTTCCTGTTACCCAA-1 1_821000_T_A 1 821000 0.48995214890403 T A 1 1 0|1
AGACACTCAGCCGTCA-1 1_823246_C_T 1 823246 0.48995214890403 C T 0 1 0|1
GAACACTTCAGACCGC-1 1_823246_C_T 1 823246 0.48995214890403 C T 1 1 0|1
GGATCTACACTTGAAC-1 1_823246_C_T 1 823246 0.48995214890403 C T 1 1 0|1
TACCGAATCCGCTTAC-1 1_823246_C_T 1 823246 0.48995214890403 C T 0 1 0|1
TATTGCTTCGAAATCC-1 1_823246_C_T 1 823246 0.48995214890403 C T 1 1 0|1
TTGACCCTCTACACTT-1 1_823246_C_T 1 823246 0.48995214890403 C T 1 1 0|1
AGGGCCTCAACAGTGG-1 1_946247_G_A 1 946247 0.620826529104498 G A 0 1 0|1 NOC2L
GAGTTGTCAACTGGTT-1 1_946247_G_A 1 946247 0.620826529104498 G A 1 1 0|1 NOC2L
GGTGAAGCAAACACCT-1 1_946247_G_A 1 946247 0.620826529104498 G A 0 1 0|1 NOC2L
GTGTGGCAGTACTGTC-1 1_946247_G_A 1 946247 0.620826529104498 G A 2 2 0|1 NOC2L
GCTACAATCCACCTCA-1 1_954724_G_A 1 954724 0.620826529104498 G A 1 1 1|0 NOC2L
GTCTGTCGTAGAGCTG-1 1_954724_G_A 1 954724 0.620826529104498 G A 0 1 1|0 NOC2L
CTCATTAGTCCCACGA-1 1_955679_C_T 1 955679 0.620826529104498 C T 0 1 0|1 NOC2L
GGAGGATCAAGGCCTC-1 1_955679_C_T 1 955679 0.620826529104498 C T 1 1 0|1 NOC2L
CACTGTCAGTGGCCTC-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
CATACCCCAAGTATCC-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
CTAACCCAGCTCGCAC-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
CTGGACGTCATGGCCG-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
GTTTACTCATTGAAGA-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
TAATTCCGTGCTTCAA-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
TCAGTCCTCAGTGGGA-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
TTCATGTAGTTGTAAG-1 1_958339_G_A 1 958339 0.620826529104498 G A 1 1 0|1 NOC2L
TTCGGTCGTACGACAG-1 1_958339_G_A 1 958339 0.620826529104498 G A 0 1 0|1 NOC2L
AAACGCTTCTTCCCAG-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
AGGACTTAGTCACTCA-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
CATCCGTCAGCAGATG-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
CATTGCCAGTCTGGTT-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
CTCCCAAGTTAATCGC-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 1 1 1|0
CTGCGAGAGCACCGAA-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 1 1 1|0
CTTCGGTAGTCACAGG-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
GAGTTACGTGTCCATA-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 1 1 1|0
GCCTGTTCATGGAGAC-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
GTACAACCACGGTGAA-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
GTAGGAGGTCGGTGTC-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
TCACGGGTCCTATTGT-1 1_1000731_C_T 1 1000731 0.620826529104498 C T 0 1 1|0
GCCCAGAAGTCATAGA-1 1_1009184_T_C 1 1009184 0.620826529104498 T C 0 1 0|1
TGCTTGCTCGACCTAA-1 1_1009184_T_C 1 1009184 0.620826529104498 T C 1 1 0|1
AAAGGGCTCAGCCTCT-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
ACCATTTCAAAGCACG-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 0 1 0|1 ISG15
AGAACAAAGCGTTCAT-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 0 1 0|1 ISG15
AGAGCAGGTGATTGGG-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
AGGAGGTCAACGGTAG-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
AGGGTTTAGGTGCAGT-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
ATCACAGCATGAAGCG-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 0 1 0|1 ISG15
ATCCTATGTGCCGAAA-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
ATCGTGAGTGCCTTCT-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
ATGGGAGAGGTCATAA-1 1_1014228_G_A 1 1014228 0.759933699602133 G A 1 1 0|1 ISG15
Seems right to me .. have you checked if you can run the ATC2 example in the interactive session & see what's different?
I ran into the same issue, if anyone else does it's an easy solve. Just add header=TRUE when you're reading in the alleles file, and you may need to un gzip the file also.
This should work:
alleles <- read.table(gzfile("/mnt/isilon/cscb/Projects/10x/xxx/04b-NumBat/MCD1/MCD1_allele_counts.tsv.gz"), sep="\t", header=TRUE)
Hello,
I am having an issue with the samples that were able to successfully work and produce allele counts with
pileup_and_phase.R
.I call the container using
singularity
like so:and I enter an interactive R session. Following the
Getting Started
guide, I try to supplyrun_numbat
with the gene x cell counts matrix and the allele counts, but regardless of the sample I use, I keep getting the same error:I wish I had more information to give you but the only other thing I can say is that some of these samples were unable to work with
pileup_and_phase.R
and produce the allele counts gz.Here are the bam and barcode files I used for each sample:
Do you have any advice on how to address this problem? The samples I'm asking about in this issue specifically are scRNA-seq samples processed by 10x CellRanger.