Open bioyuyang opened 5 years ago
Hi Yuyang,
I'll take a look at your error as soon as possible.
For an example .3dg file, there's already an FTP link in README.md, but it's not showing up because GitHub doesn't support FTP links. I've pasted it below for your convenience:
ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM3271nnn/GSM3271352/suppl/GSM3271352_gm12878_06.impute3.round4.clean.3dg.txt.gz
The corresponding GEO accession contains final files like this for all single cells, as well as all intermediate files starting from raw.con.gz. However, the two earliest files, aln.bam and phased.seg.gz, haven't been provided because of their large size.
Your error seems to come from a discrepancy in chromosome naming between your genome file (chr1 in your hg19.fa) and the SNP file you used (1 in snps/NA12878.txt.gz). You must change one of them to match the other.
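To make the mismatch concrete, here is a minimal Python sketch that compares the chromosome names in a FASTA with those in an SNP file and rewrites the SNP lines to match. It rests on assumptions not confirmed in this thread: the SNP file is tab-delimited with the chromosome name in its first column, and the two naming schemes differ only by a `chr` prefix (so a pair like `chrM` vs `MT` would need separate handling). The function names are illustrative and not part of dip-c.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA headers (text after '>' up to whitespace)."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].split()
    }


def snp_names(lines):
    """Collect chromosome names from the first tab-separated column (assumed layout)."""
    return {
        line.split("\t", 1)[0].strip()
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def set_chr_prefix(name, want_prefix):
    """Add or strip a leading 'chr' so the name follows the requested scheme."""
    has_prefix = name.startswith("chr")
    if want_prefix and not has_prefix:
        return "chr" + name
    if not want_prefix and has_prefix:
        return name[3:]
    return name


def rename_snp_lines(lines, want_prefix):
    """Yield SNP lines with only the first column renamed; other columns are untouched."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        yield set_chr_prefix(chrom, want_prefix) + sep + rest


# Toy data standing in for hg19.fa and snps/NA12878.txt.gz (values are made up).
fasta = [">chr1 example description\n", "ACGT\n", ">chr2\n", "GGCC\n"]
snps = ["1\t12345\tA\tG\n", "2\t67890\tC\tT\n"]

shared = fasta_names(fasta) & snp_names(snps)
print("shared names before renaming:", sorted(shared))

fixed = list(rename_snp_lines(snps, want_prefix=True))
shared_after = fasta_names(fasta) & snp_names(fixed)
print("shared names after renaming:", sorted(shared_after))
```

On real data, the same functions can be fed file handles, for example `gzip.open("snps/NA12878.txt.gz", "rt")` for the SNP file and `open("hg19.fa")` for the genome. An empty intersection before renaming indicates this kind of naming discrepancy.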
The importance of chromosome name matching has been mentioned in an earlier comment for this repo, and another comment for the companion repo hickit.
OK. I will change the file and be careful in the following steps. Thanks so much for the quick response; without it I would have stayed stuck.
From: Longzhi Tan notifications@github.com
Sent: November 28, 2018, 22:36:52
To: tanlongzhi/dip-c
Cc: bioyuyang; Author
Subject: Re: [tanlongzhi/dip-c] Error in dip-c seg and intermediate files (#21)
The importance of chromosome name matching has been mentioned in an earlier comment (https://github.com/tanlongzhi/dip-c/issues/13#issuecomment-424156876) for this repo, and another comment (https://github.com/tanlongzhi/dip-c/issues/13#issuecomment-424448646) for the companion repo hickit.
Hi Tan,
I ran the dip-c seg command and ran into a problem.
The message is as follows:
Traceback (most recent call last):
File "/THL8/home/liubin/software/dip-c-master/dip-c", line 130, in
Hi @liubinnk1, please see my reply to your identical question in the other thread. Best, Tan
Hi Tan,
This is Yuyang from Tsinghua University, Beijing. I hope you are having a nice holiday.
I just followed the "Typical Workflow" on my server and ran into trouble at the very first step.
../seqtk-master/seqtk mergepe SRR7226685_1.fastq SRR7226685_2.fastq | ../lianti-master/lianti trim - |../bwa-master/bwa mem -Cp ../hg19.fa - | samtools view -uS |../sambamba-0.6.8-linux-static sort -o aln.bam /dev/stdin
./dip-c seg -v snps/NA12878.txt.gz aln.bam | gzip -c > phased.seg.gz
It throws an error in the second step.
By the way, would you mind uploading some key intermediate files? That would make the pipeline easier to follow and to debug. For example, in the "Interactive Visualization of 3D Genomes" section, cell.3dg is used throughout to make the pretty figures. Moreover, the 3D reconstruction process seems a bit tricky, as you also showed in Fig. S8 of your Science paper. Do you have any suggestions for obtaining a reasonable simulated 3D structure?
Thanks so much for your help! Yuyang