Open snaqvi1990 opened 6 years ago
I am attempting this as well... I tried the following command:
velocyto run -U "file.bam" "file.gtf"
I get the error as follows which is weird because I am using the -U flag:
raise IOError("The bam file does not contain cell and umi barcodes appropriatelly formatted. If you are runnin UMI-less data you should use the -U flag.")
OSError: The bam file does not contain cell and umi barcodes appropriatelly formatted. If you are runnin UMI-less data you should use the -U flag.
Any help would be appreciated
Follow-up: I followed the advice of snower2010 in https://github.com/velocyto-team/velocyto.py/issues/107 and used simplesam (https://simplesam.readthedocs.io/en/latest/#indices-and-tables) to inspect the sorted BAM file output by STAR. As expected, the reads had no 'CB' or 'UB' tags. I used simplesam to add dummy CB and UB tags, and velocyto now appears to run without error. Hope this helps someone else.
from simplesam import Reader, Writer

# Inspect the tags on the first read
in_file = open('in.bam', 'r')
in_sam = Reader(in_file)
x = next(in_sam)
x.tags

# Add barcode and UMI tags taken from the read name
barcode_tag = 'CB'
umi_tag = 'UB'
with Reader(open('in.bam')) as in_bam:
    with Writer(open('out.sam', 'w'), in_bam.header) as out_sam:
        for read in in_bam:
            read[umi_tag] = read.qname.split(":")[2]  # add the umi tag
            read[barcode_tag] = read.qname.split(":")[1]  # add the barcode tag
            out_sam.write(read)
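For anyone without simplesam, the same tagging step can be sketched on plain SAM text lines. This is a minimal sketch and not part of velocyto or simplesam: the function name `add_tags_from_qname` is hypothetical, and it assumes read names of the form NAME:BARCODE:UMI, as in the snippet above.

```python
def add_tags_from_qname(sam_line, barcode_tag="CB", umi_tag="UB"):
    """Append barcode and UMI tags to one SAM alignment line.

    Assumption: the read name looks like NAME:BARCODE:UMI.
    """
    if sam_line.startswith("@"):
        return sam_line  # header lines pass through unchanged
    fields = sam_line.rstrip("\n").split("\t")
    name_parts = fields[0].split(":")
    if len(name_parts) < 3:
        raise ValueError(f"read name {fields[0]!r} has no barcode/UMI fields")
    # Optional tags start at column 12 (index 11) of a SAM line
    present = {field.split(":", 1)[0] for field in fields[11:]}
    if barcode_tag not in present:
        fields.append(f"{barcode_tag}:Z:{name_parts[1]}")
    if umi_tag not in present:
        fields.append(f"{umi_tag}:Z:{name_parts[2]}")
    return "\t".join(fields) + "\n"
```

Existing CB or UB tags are left untouched, so running it twice on the same line does not duplicate tags.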
Following on from this, did you get an arrow on each sample, as in Figure 1h? I tried several ways of altering the scRNA-seq pipeline without success (no arrow on each sample). This is what I have tried (any suggestions are welcome):
@z5ouyang I've also redrawn Figure 1h, but the samples and arrows do not form a circle. Have you made any progress?
I think you should do a feature selection using genes that seem to vary across the time course. Then stop at step 3 of the analysis @z5ouyang is doing, plot the PCA of vlm.Sx_sz, and draw arrows that point from vlm.Sx_sz to vlm.Sx_sz + k * self.delta_S, where k is a scaling constant chosen for good visualization. Also check that the fit of the gammas is working well, since there are extremely few points in this case. You might want to use trivial regression with an intercept, because the quantile fit would basically pass a line through 2 points.
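To illustrate the regression suggestion, here is a minimal sketch of an ordinary least-squares fit with an intercept for a single gene, u = gamma * s + q, using all samples and not only the extreme quantiles. The function name `fit_gamma_with_intercept` and the input lists are hypothetical. This is not velocyto's own fitting routine.

```python
def fit_gamma_with_intercept(spliced, unspliced):
    """Least-squares fit of unspliced = gamma * spliced + q for one gene.

    Returns (gamma, q). Uses every sample, which is the point when
    only a handful of bulk samples are available.
    """
    n = len(spliced)
    if n != len(unspliced) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_s = sum(spliced) / n
    mean_u = sum(unspliced) / n
    # Sum of squared deviations of s, and co-deviations of s and u
    var_s = sum((s - mean_s) ** 2 for s in spliced)
    cov_su = sum((s - mean_s) * (u - mean_u)
                 for s, u in zip(spliced, unspliced))
    if var_s == 0:
        raise ValueError("spliced values are all identical")
    gamma = cov_su / var_s
    q = mean_u - gamma * mean_s
    return gamma, q
```

With six samples, a fit like this uses all six points, whereas a fit restricted to the extreme quantiles would rest on roughly one point at each end.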
I think at the end, I followed this one: http://pklab.med.harvard.edu/velocyto/notebooks/R/chromaffin2.nb.html
Hi velocyto team,
I followed the replies above, but velocyto reports an error in my HG00096.1.M_111124_6.bam file. I used the following three steps:
samtools sort HG00096.1.M_111124_6.bam > sort_HG00096.1.M_111124_6.bam
import simplesam
barcode_tag = 'CB'
umi_tag = 'UB'
with simplesam.Reader(open("sort_HG00096.1.M_111124_6.bam")) as in_bam:
    with simplesam.Writer(open("umi_sort_HG00096.1.M_111124_6.sam", 'w'), in_bam.header) as out_sam:
        for read in in_bam:
            read[umi_tag] = read.qname.split(":")[2]  # add the umi tag
            read[barcode_tag] = read.qname.split(":")[1]  # add the barcode tag
            out_sam.write(read)
# sam file to bam file
samtools view -S -b umi_sort_HG00096.1.M_111124_6.sam > umi_sort_HG00096.1.M_111124_6.bam
gtffile=gencode.v19.chr_patch_hapl_scaff.annotation.gtf
bamfile=umi_sort_HG00096.1.M_111124_6.bam
velocyto run ${bamfile} ${gtffile} -U -c
The error is
if unique and read.get_tag("NH") != 1:
File "pysam/libcalignedsegment.pyx", line 2395, in pysam.libcalignedsegment.AlignedSegment.get_tag
File "pysam/libcalignedsegment.pyx", line 2437, in pysam.libcalignedsegment.AlignedSegment.get_tag
KeyError: "tag 'NH' not present"
Could you tell me how to deal with the error? Many thanks
Best,
Sheng
Hi,
I solved the error by adding NH information, following #198. But there are still some errors; I give the details in #237.
Thanks!
Best,
Sheng
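For reference, the traceback above comes from the check `read.get_tag("NH") != 1`, so every alignment needs an NH tag. A minimal sketch of adding one to SAM text lines follows. The function name `ensure_nh_tag` is hypothetical, and writing NH:i:1 for every read assumes that all reads in the file are uniquely mapped, which is only reasonable if multi-mapping reads were already filtered out.

```python
def ensure_nh_tag(sam_line, default_hits=1):
    """Append NH:i:<default_hits> to a SAM alignment line that lacks NH.

    Assumption: reads without NH are uniquely mapped (NH = 1).
    """
    if sam_line.startswith("@"):
        return sam_line  # header lines pass through unchanged
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at column 12 (index 11) of a SAM line
    if not any(field.startswith("NH:i:") for field in fields[11:]):
        fields.append(f"NH:i:{default_hits}")
    return "\t".join(fields) + "\n"
```

The function could be applied to each line of `samtools view -h` output, with the result converted back to BAM afterwards.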
Hello,
This is more of a question than an issue, but I would like to run velocyto on bulk, 100x100 RNA-seq. I currently have 6 samples, all with >30 million reads. Is there any particular way you would suggest doing this? I tried run_smartseq2 on the samples, one .bam file at a time, but it crashed...
Thanks, Sahin