Open j-chou opened 9 years ago
I think I've figured out the issue: I didn't have taxa_dict in my actual data folder (the one not in scratch). Sorry for the trouble! I'll try running everything again.
Oh! I kept trying to figure out the reason. Good.
Hey guys, so I've converted the fasta files in the 15-taxon dataset using a script Ruth pointed out to me online. I'm now trying to sample and relabel the alignments, but I get the following errors when I run the script run_pipeline_cluster.sh:
Traceback (most recent call last):
  File "src-pipeline/taxon_relabeler.py", line 72, in <module>
    processFilesTaxon(inp_folder,out_folder,dict_file)
  File "src-pipeline/taxon_relabeler.py", line 57, in processFilesTaxon
    new_dna_string+=new_taxon+'\t'+str(dna[old_taxon])+'\n'
  File "/usr/local/python/2.7.8/lib/python2.7/site-packages/dendropy/dataobject/char.py", line 1168, in __getitem__
    raise KeyError(label)
KeyError: '0'
cp: cannot stat `/home/jedchou1/scratch/AGBsvdquartets/data/sim1/S_relabeled_tree.trees': No such file or directory
The 15-taxon dataset is laid out as follows. On the cluster, I have a folder scratch/AGBsvdquartets/data. Inside this data folder are 10 folders called sim1, sim2, ..., sim10, as well as a file called taxa_dict.txt, which just lists the taxon names A, B, C, ..., O and their corresponding numbers 1, 2, 3, ..., 15, with each name separated from its number by a tab character. Inside each sim folder are 1000 fasta alignments (1.fasta, 2.fasta, ...), 1000 phylip alignments (1.phy, 2.phy, ...), and a file called s_tree.trees, which contains the true species tree in newick format.
Do you guys know what's causing this error?
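For anyone hitting something similar, here is a rough sketch of a pre-flight check for the layout described above, to run before launching the pipeline. It is not part of the repository: the function names `load_taxa_dict` and `preflight` are made up for illustration, and it assumes taxa_dict.txt has one name/number pair per line. It only checks that the dictionary file and each sim folder's s_tree.trees exist. It does not look inside the alignments.

```python
import os


def load_taxa_dict(path):
    """Parse a tab-separated name/number file into {name: number_string}.

    Assumes one 'name<TAB>number' pair per line (an assumption, not
    something the pipeline documents).
    """
    mapping = {}
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # tolerate blank lines
            parts = line.split("\t")
            if len(parts) != 2:
                raise ValueError(
                    "line %d of %s is not 'name<TAB>number': %r"
                    % (line_no, path, line)
                )
            mapping[parts[0].strip()] = parts[1].strip()
    return mapping


def preflight(data_dir, n_sims=10, dict_name="taxa_dict.txt",
              tree_name="s_tree.trees"):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []

    # The dictionary file must sit directly inside the data folder.
    dict_path = os.path.join(data_dir, dict_name)
    if not os.path.isfile(dict_path):
        problems.append("missing " + dict_path)
    else:
        try:
            load_taxa_dict(dict_path)
        except ValueError as err:
            problems.append(str(err))

    # Each simulation folder sim1..simN needs its species tree file.
    for i in range(1, n_sims + 1):
        sim_dir = os.path.join(data_dir, "sim%d" % i)
        if not os.path.isdir(sim_dir):
            problems.append("missing " + sim_dir)
            continue
        tree_path = os.path.join(sim_dir, tree_name)
        if not os.path.isfile(tree_path):
            problems.append("missing " + tree_path)

    return problems
```

Calling `preflight` on both the scratch copy and the non-scratch copy of the data folder and printing the returned list should show right away whether one of them lacks taxa_dict.txt.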