schneebergerlab / AMPRIL-genomes

scripts for the project of seven thaliana genomes assembly

about pan-genome analysis #1

Open sunnycqcn opened 4 years ago

sunnycqcn commented 4 years ago

Hello,

Thank you very much for your excellent work. I am trying to run your code in my own project with the pangenome command, and I got the following error:

Starting ====== 2020-02-27 12:40:48 ======
Traceback (most recent call last):
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/AMPRIL-genomes/pangenome/wga.pangenome.py", line 236, in <module>
    main(sys.argv[1:])
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/AMPRIL-genomes/pangenome/wga.pangenome.py", line 52, in main
    result = runJob(arr, num, chrBeds, accs, wgadir, outdir)
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/AMPRIL-genomes/pangenome/wga.pangenome.py", line 91, in runJob
    t = getLen(chrBeds[accs[j]])
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/AMPRIL-genomes/pangenome/wga.pangenome.py", line 217, in getLen
    fi = open(inFile,"r")
IOError: [Errno 2] No such file or directory: './chrsize/C24.leng.txt'

Can I use your code in my project? I checked the script and found that accs is defined as follows:

accs = ["An-1","C24","Col","Cvi","Eri","Kyo","Ler","Sha"]

I think this is specific to your project.

Thanks,
Fuyou

wen-biao commented 4 years ago

Hi Fuyou,

The error message shows that you need a genome length file for each genome. It is a tab-delimited file like the one below:

chr1	25332333
chr2	192012013
...
chrN	29139132

The script should work for other pan-genome projects after changing some lines accordingly.
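For readers who need to produce such a length file, here is a minimal Python sketch that derives it from a genome FASTA. It is not part of the AMPRIL-genomes scripts: the function names `fasta_lengths` and `write_length_file` are hypothetical, the output path simply mirrors the `./chrsize/C24.leng.txt` path seen in the traceback, and it assumes the sequence name is the first whitespace-delimited token of each FASTA header.

```python
import os


def fasta_lengths(fasta_path):
    """Return a list of (sequence_name, length) tuples in file order."""
    lengths = []
    name, size = None, 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: store the previous one, if any.
                if name is not None:
                    lengths.append((name, size))
                parts = line[1:].split()
                # Assumption: the name is the first token of the header.
                name = parts[0] if parts else ""
                size = 0
            elif name is not None:
                size += len(line)
    if name is not None:
        lengths.append((name, size))
    return lengths


def write_length_file(fasta_path, out_path):
    """Write one 'name<TAB>length' line per sequence to out_path."""
    out_dir = os.path.dirname(out_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(out_path, "w") as out:
        for name, size in fasta_lengths(fasta_path):
            out.write(f"{name}\t{size}\n")


# Example call (paths are illustrative only):
# write_length_file("C24.fasta", "./chrsize/C24.leng.txt")
```

The sequence names written here would need to match whatever chromosome names the downstream files use; that requirement is an assumption, not something stated in this thread.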

sunnycqcn commented 4 years ago

Hello Wen-Biao,

I created the tab-delimited files and changed accs to my own accession names. The code works well, but I still have a small question about the files from "all pairwise whole genome comparisons using MUMmer". When I run the modified wga.pangenome.py, I get the following errors:

cat: ./genome/L002/L003/L003.del.bed: No such file or directory
cat: ./genome/L003/L002/L002.del.bed: No such file or directory
cat: ./genome/L002/L003/L003.wga.block.txt: No such file or directory
cat: ./genome/L003/L002/L002.wga.block.txt: No such file or directory

Are these pairwise files generated by the command "syri -c out.chrom.coords -d out_m_i90_l100.delta -r Col.fasta -q An-1.fasta --nc 5 --all -k", or do we have to generate them with MUMmer ourselves? If we have to create them ourselves, could you tell me the format?

Thanks,
Fuyou

liufy11 commented 3 years ago

I ran the pangenome script without problems. In my opinion, you should create the above files yourself. See README.md under AMPRIL-genomes/pangenome.
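Before launching the script, it can help to list which pairwise input files are still missing. The sketch below is only a pre-flight check and does not create anything. The directory layout `<wgadir>/<A>/<B>/<B>.del.bed` and `<B>.wga.block.txt` is inferred from the `cat` error messages quoted above, not taken from the README, and the function names are hypothetical.

```python
import itertools
import os


def expected_pairwise_files(wgadir, accs,
                            suffixes=(".del.bed", ".wga.block.txt")):
    """List the paths expected for every ordered pair of accessions.

    Assumed layout (inferred from the error messages in this thread):
    <wgadir>/<A>/<B>/<B><suffix>
    """
    paths = []
    for a, b in itertools.permutations(accs, 2):
        for suffix in suffixes:
            paths.append(os.path.join(wgadir, a, b, b + suffix))
    return paths


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Example call (accession names taken from the error messages above):
# for path in missing_files(expected_pairwise_files("./genome", ["L002", "L003"])):
#     print("missing:", path)
```

With the two accessions from the thread, the expected list contains exactly the four paths reported by `cat`.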