Xieeeee / Droplet-Hi-C

🍹This repo includes the scripts and analysis notebook for the Droplet Hi-C manuscript

May I know the format of your barcode.index file? #2

Open Xuyuch opened 3 hours ago

Xuyuch commented 3 hours ago

Hi Yang, I was testing the code and ran into an error when running convert_hic2. I believe the problem comes from this part:

(zcat ${fastq_dir}/${s}_R1_combined.fq.gz | bowtie -x ${ref_10X} - --nofw -m 1 -v 1 -S ${fastq_dir}/${s}_R1_BC.sam) 2>${fastq_dir}/${s}_R1.log
zcat ${fastq_dir}/${s}_R3_combined.fq.gz | bowtie -x ${ref_10X} - --nofw -m 1 -v 1 -S ${fastq_dir}/${s}_R3_BC.sam

Here, I think the barcode file was changed from a txt file to a fa file, which is then indexed with bowtie. So I manually named the barcodes barcode_# and created a 737K-arc-v1.fa (attached here: [737K-arc-v1.fa.txt](https://github.com/user-attachments/files/17532200/737K-arc-v1.fa.txt)). However, even though bowtie reports a good alignment rate (log below), convert_hic2 fails with a Segmentation fault (core dumped).

# reads processed: 501939313
# reads with at least one alignment: 495805480 (98.78%)
# reads that failed to align: 6133833 (1.22%)
# reads with alignments suppressed due to -m: 4054359 (0.81%)
Reported 491751121 alignments

Could you please help me solve this issue? Or could you share the 737K-arc-v1 bowtie index you used in your pipeline? Thank you!

Best, Yuchen

Xieeeee commented 2 hours ago

Hi Yuchen, can you show me the first few lines of your index? The index should be in this format, with each header being the barcode sequence itself:

>AAACGAAAGAAACGCC
AAACGAAAGAAACGCC
>AAACGAAAGAAAGCAG
AAACGAAAGAAAGCAG
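
For readers who start from a plain barcode list, the format above can be produced with a short awk one-liner. This is only a minimal sketch, not the repository's own script: the function name `barcodes_to_fasta` is hypothetical, and it assumes the input has one barcode per line.

```shell
# Hypothetical helper (not part of the Droplet-Hi-C repo).
# Assumes $1 is a plain-text whitelist with one barcode per line.
barcodes_to_fasta() {
    # For every non-empty line, write the barcode once as the FASTA
    # header and once as the sequence, matching the format shown above.
    awk 'NF { printf ">%s\n%s\n", $1, $1 }' "$1"
}
```

Assuming the list is saved as `737K-arc-v1.txt` (a file name chosen here for illustration), `barcodes_to_fasta 737K-arc-v1.txt > 737K-arc-v1.fa` followed by `bowtie-build 737K-arc-v1.fa 737K-arc-v1` would give an index prefix that `${ref_10X}` could point to.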

Xuyuch commented 2 hours ago

Oh, I know the reason now. My version of the index looks like this:

>barcode_1
ACAGCGGGTGTGTTAC
>barcode_2
ACAGCGGGTTGTTCTT
>barcode_3
ACAGCGGGTAACAGGC
>barcode_4
ACAGCGGGTGCGCGAA
>barcode_5
ACAGCGGGTCCTCCAT
>barcode_6
ACAGCGGGTCATGGTT
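
A quick way to spot this kind of mismatch before running the pipeline is to count the records whose header differs from the sequence. The sketch below is an illustration only: `count_mismatched_headers` is a hypothetical name, and it assumes each sequence sits on a single line, as in the barcode FASTA above.

```shell
# Hypothetical check (not part of the Droplet-Hi-C repo).
# Assumes a FASTA in $1 where every sequence occupies exactly one line.
count_mismatched_headers() {
    awk '
        /^>/ { header = substr($0, 2); next }    # remember header without ">"
        NF   { if (header != $1) mismatches++ }  # compare it to the sequence
        END  { print mismatches + 0 }            # print 0 when all match
    ' "$1"
}
```

A result of 0 would mean every header is the barcode sequence itself. Any other count suggests the headers carry names such as barcode_1 instead.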

I will modify my index to match your version and test it. Thank you! Yuchen
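
The change described here, replacing the barcode_# names with the barcode sequences, could be sketched as follows. This is an assumption about how one might do it, not a confirmed fix for the segmentation fault: `fix_fasta_headers` is a hypothetical name, and the sketch assumes single-line sequences.

```shell
# Hypothetical helper (not part of the Droplet-Hi-C repo).
# Assumes $1 is a FASTA with one sequence line per record.
fix_fasta_headers() {
    # Skip the existing header lines and re-emit every sequence
    # with the sequence itself as its new header.
    awk '!/^>/ && NF { printf ">%s\n%s\n", $1, $1 }' "$1"
}
```

After writing the result to a new file, the bowtie index would need to be rebuilt with `bowtie-build` so that the reference names in the SAM output become the barcode sequences.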