davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

Error during Running Orthologue Prediction, Step 5, on ExampleDataset #36

Closed sujaikumar closed 7 years ago

sujaikumar commented 7 years ago

Hi David

I tried running orthofinder (checked out today at 5pm), with the new functionality, and am getting the errors below even with your Tests/Input/ExampleDataset of 4 Mycoplasma protein files.

Any thoughts on what could be causing the errors? The program gives errors during Running Orthologue Prediction, Step 5, but then seems to finish correctly.

However, when I then look at Species-by-species orthologues: .../Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/Orthologues/Orthologues_Mycoplasma_/ - the files there just have headers and nothing else in them.

From the error messages below, it looks like it is trying to write to an output path that is malformed, e.g.:

/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000310_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000310_tree_id.coal.tree

(which has the /scratch/skumar/.... bit in it twice)
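
For reference, a minimal Python sketch of why a path like this produces `[Errno 20] Not a directory`: the first half names a regular file, so the OS cannot descend into it as if it were a directory. The file names below are hypothetical stand-ins for the real paths, and the snippet assumes a POSIX system.

```python
import errno
import os
import tempfile


def open_error_code(path):
    """Try to open `path` for writing; return the errno on failure, else None."""
    try:
        with open(path, "w"):
            return None
    except OSError as exc:  # IOError is an alias of OSError in Python 3
        return exc.errno


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for .../Trees_ids_arbitraryRoot/OG0000310_tree_id.txt
    tree_file = os.path.join(workdir, "OG_tree_id.txt")
    with open(tree_file, "w") as handle:
        handle.write("(a,b);\n")

    # Stand-in for the intended output, .../dlcpar/OG_tree_id.coal.tree
    out_dir = os.path.join(workdir, "dlcpar")
    os.mkdir(out_dir)
    out_file = os.path.join(out_dir, "OG_tree_id.coal.tree")

    # Gluing the two absolute paths together treats the tree file as a directory.
    mangled = tree_file + out_file
    mangled_errno = open_error_code(mangled)
    intended_errno = open_error_code(out_file)

print(mangled_errno == errno.ENOTDIR, intended_errno)
```

The intended output path opens without error, while the glued-together path fails because one of its intermediate components is a file.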

Thanks,

Sujai

ps. A dump of the full ExampleDataset folder as run below is at: ftp://ftp.ed.ac.uk/edupload/ExampleDataset.tar.gz


orthofinder.py -f /scratch/skumar/Tests/Input/ExampleDataset -t 16

OrthoFinder version 1.0.5 Copyright (C) 2014 David Emms

    This program comes with ABSOLUTELY NO WARRANTY.
    This is free software, and you are welcome to redistribute it under certain conditions.
    For details please see the License.md that came with this software.

16 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm

1. Checking required programs are installed
-------------------------------------------
Test can run "makeblastdb -help" - ok
Test can run "blastp -help" - ok
Test can run "mcl -h" - ok

2. Temporarily renaming sequences with unique, simple identifiers
------------------------------------------------------------------

3. Dividing up work for BLAST for parallel processing
-----------------------------------------------------
2016-09-26 18:30:33 : Creating Blast database 1 of 4
2016-09-26 18:30:33 : Creating Blast database 2 of 4
2016-09-26 18:30:33 : Creating Blast database 3 of 4
2016-09-26 18:30:33 : Creating Blast database 4 of 4

4. Running BLAST all-versus-all
-------------------------------
Using 16 thread(s)
2016-09-26 18:30:33 : This may take some time....
2016-09-26 18:30:33 : Done 0 of 16

5. Running OrthoFinder algorithm
--------------------------------
2016-09-26 18:30:53 : Initial processing of each species
2016-09-26 18:30:53 : Initial processing of species 0 complete
2016-09-26 18:30:54 : Initial processing of species 1 complete
2016-09-26 18:30:54 : Initial processing of species 2 complete
2016-09-26 18:30:54 : Initial processing of species 3 complete
2016-09-26 18:30:59 : Connected putatitive homologs
2016-09-26 18:30:59 : Writen final scores for species 0 to graph file
2016-09-26 18:30:59 : Writen final scores for species 1 to graph file
2016-09-26 18:30:59 : Writen final scores for species 2 to graph file
2016-09-26 18:30:59 : Writen final scores for species 3 to graph file
2016-09-26 18:31:00 : Ran MCL

6. Writing orthogroups to file
------------------------------
A duplicate accession was found using just first part: A1
Tried to use only the first part of the accession in order to list the sequences in each orthogroup
more concisely but these were not unique. The full accession line will be used instead.

Orthogroups have been written to tab-delimited files:
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups.csv
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups.txt (OrthoMCL format)
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups_UnassignedGenes.csv

Running Orthologue Prediction
=============================

1. Checking required programs are installed
-------------------------------------------
Test can run "fastme -i /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/WorkingDirectory/SimpleTest.phy -o /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/WorkingDirectory/SimpleTest.tre" - ok
Test can run "dlcpar_search --version" - ok

2. Calculating gene distances
-----------------------------
2016-09-26 18:31:00 : Done 0 of 16
2016-09-26 18:31:02 : Processing species 0
2016-09-26 18:31:02 : Processing species 1
2016-09-26 18:31:02 : Processing species 2
2016-09-26 18:31:03 : Processing species 3

3. Inferring gene and species trees
-----------------------------------
2016-09-26 18:31:03 : Done 0 of 315
2016-09-26 18:31:03 : Done 100 of 315
2016-09-26 18:31:03 : Done 200 of 315
A duplicate accession was found using just first part: A1
Tried to use only the first part of the accession in order to list the sequences in each orthogroup
more concisely but these were not unique. The full accession line will be used instead.

4. Best outgroup(s) for species tree
------------------------------------
Observed 3 duplications. 3 support the best root and 0 contradict it.
Best outgroup for species tree:
  Mycoplasma_agalactiae_5632_FP671138, Mycoplasma_hyopneumoniae_AE017243

5. Reconciling gene and species trees
-------------------------------------
Outgroup: Mycoplasma_agalactiae_5632_FP671138, Mycoplasma_hyopneumoniae_AE017243
2016-09-26 18:31:07 : Done 0 of 314
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
Traceback (most recent call last):
    rootData=True)
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000014_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000014_tree_id.coal.tree'
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000008_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000008_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    stream = open(filename, mode)
IOError:     phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
[Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000006_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000006_tree_id.coal.tree'  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon

    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000011_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000011_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000015_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000015_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000004_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000004_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000013_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000013_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000009_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000009_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000007_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000007_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000003_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000003_tree_id.coal.tree'
Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 201, in main
    phyloDLC.write_dlcoal_recon(out, coal_tree, maxrecon)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 339, in write_dlcoal_recon
    recon.write(filename, coal_tree, exts=exts, filenames=filenames)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/yjw/bio/phyloDLC.py", line 169, in write
    rootData=True)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/treelib.py", line 602, in write_newick
    write_newick(self, util.open_stream(out, "w"),
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 20] Not a directory: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000010_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000010_tree_id.coal.tree'

(error messages continue for each Orthogroup)

6. Inferring orthologues from gene trees
----------------------------------------
2016-09-26 18:31:28 : Processing orthologues for species 0
2016-09-26 18:31:28 : Processing orthologues for species 1
2016-09-26 18:31:28 : Processing orthologues for species 2
2016-09-26 18:31:28 : Processing orthologues for species 3

7. Writing results files
------------------------
Orthogroups have been written to tab-delimited files:
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups.csv
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups.txt (OrthoMCL format)
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthogroups_UnassignedGenes.csv

Gene trees:
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Gene_Trees

Rooted species tree:
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/SpeciesTree_rooted.txt

Species-by-species orthologues:
   /scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26

Orthogroup statistics:
   Statistics_PerSpecies.csv   Statistics_Overall.csv   Orthogroups_SpeciesOverlaps.csv

OrthoFinder assigned 1938 genes (70.9% of total) to 536 orthogroups. Fifty percent of all genes were in orthogroups
with 4 or more genes (G50 was 4) and were contained in the largest 300 orthogroups (O50 was 300). There were 280
orthogroups with all species present and 253 of these consisted entirely of single-copy genes.

When publishing work that uses OrthoFinder please cite:
    D.M. Emms & S. Kelly (2015), OrthoFinder: solving fundamental biases in whole genome comparisons
    dramatically improves orthogroup inference accuracy, Genome Biology 16:157.

davidemms commented 7 years ago

Hi Sujai

There's a line in the file orthofinder/scripts/get_orthologues.py which says,

# print(dlcCommands[0])

Could you uncomment this line and post the line of text that it prints out? It'll be somewhere in the output and should start with "dlcpar_search -s".

Thanks, David

sujaikumar commented 7 years ago

print(dlcCommands[0])

resulted in:

dlcpar_search -s /scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids/SpeciesTree_ids_0_rooted.txt -S /scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids_arbitraryRoot/GeneMap.smap -D 1 -C 0.125 /scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000000_tree_id.txt -O /scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/dlcpar/OG0000000_tree_id

Running this command by hand (from the same location where I ran orthofinder.py, with the same env) gave:

Traceback (most recent call last):
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 209, in <module>
    sys.exit(main())
  File "/exports/virt_env/python/orthofinder/bin/dlcpar_search", line 137, in main
    gene2species = phylo.read_gene2species(options.smap)
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/compbio/phylo.py", line 94, in read_gene2species
    util.open_stream(filename))))
  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)
IOError: [Errno 2] No such file or directory: '/scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids_arbitraryRoot/GeneMap.smap'

I checked, and only this file exists:

/scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids/SpeciesTree_ids_0_rooted.txt

and none of these files exist:

/scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids_arbitraryRoot/GeneMap.smap
/scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000000_tree_id.txt
/scratch/skumar/maker_test/ExampleDataset/Results_Sep27/Orthologues_Sep27/WorkingDirectory/dlcpar/OG0000000_tree_id
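
The manual check above can be sketched in Python: split the printed command and test each path-like argument for existence. The helper name `missing_paths` is made up for illustration, and treating every argument that starts with `/` as a path is a simplifying assumption.

```python
import os
import shlex


def missing_paths(command):
    """Return the absolute-path arguments of `command` that do not exist."""
    tokens = shlex.split(command)
    paths = [tok for tok in tokens if tok.startswith("/")]
    return [p for p in paths if not os.path.exists(p)]


# Shortened, hypothetical command in the same shape as the dlcpar_search call.
example = "dlcpar_search -s / -S /no/such/GeneMap.smap -D 1 -C 0.125 /no/such/tree.txt"
print(missing_paths(example))
```

Judging by the `.coal.tree` suffix in the earlier error messages, the `-O` argument appears to be an output prefix rather than a file, so a check like this would be expected to report it as missing even after a successful run.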

davidemms commented 7 years ago

Yes, it's a good idea to try running the command. The command line looks exactly right so I don't think the mangled path is coming from OrthoFinder: '/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/Trees_ids_arbitraryRoot/OG0000010_tree_id.txt/scratch/skumar/Tests/Input/ExampleDataset/Results_Sep26/Orthologues_Sep26/WorkingDirectory/dlcpar/OG0000010_tree_id.coal.tree'

Those files you were looking for are deleted by OrthoFinder at the end of a run so as to clean up after itself.

I think the problem might be that you have two conflicting parts of dlcpar from two different versions. I've checked the line in dlcpar that is generating the error above:

  File "/exports/virt_env/python/orthofinder/lib/python2.7/site-packages/dlcpar/deps/rasmus/util.py", line 1170, in open_stream
    stream = open(filename, mode)

This version of the util.py file comes from dlcpar 0.9.7, the version on GitHub, whereas I believe you're now using the 0.9.1 version of dlcpar_search that is available from the webpage (http://compbio.mit.edu/dlcpar/). I think that when you installed 0.9.1 it didn't delete or overwrite the 0.9.7 files, and the mix of the two is causing the conflict. Can you remove both and install 0.9.1 again from a clean state? It's possible that the issue with 0.9.7 was also an installation problem; I'll check whether that version works for me.
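
One way to see which copies are actually being picked up is to ask Python where the package resolves from and to look up the script on the search path. A small sketch, assuming Python 3 (the virtualenv in this thread is Python 2.7, so it would need adapting); `locate_install` is a hypothetical helper, not part of dlcpar or OrthoFinder.

```python
import importlib.util
import shutil


def locate_install(package, script):
    """Report where `package` would be imported from and where `script` is on PATH."""
    spec = importlib.util.find_spec(package)
    return {
        "package_origin": spec.origin if spec is not None else None,
        "script_path": shutil.which(script),
    }


# For this thread the call would be locate_install("dlcpar", "dlcpar_search");
# a stdlib module is used here so the snippet runs anywhere.
print(locate_install("json", "python3"))
```

If the package directory and the script come from different installs, that would be consistent with the mixed-version situation described above.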

davidemms commented 7 years ago

Note that stream = open(filename, mode) is on line 1170 in dlcpar version 0.9.7 (and in the error message above), whereas it is on line 1209 in version 0.9.1. If I give dlcpar a non-existent output path, the error is thrown from line 1209, which is why I think there might be a dlcpar installation problem.
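
The same line-number comparison can be sketched in Python: scan a source file and report where a given statement occurs, then compare that against the line number in the traceback. `find_statement_lines` is a hypothetical helper, and the file written below is a toy stand-in, not dlcpar's real util.py.

```python
import os
import tempfile


def find_statement_lines(source_path, statement):
    """Return 1-based line numbers whose stripped text equals `statement`."""
    matches = []
    with open(source_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip() == statement:
                matches.append(number)
    return matches


with tempfile.TemporaryDirectory() as workdir:
    fake_util = os.path.join(workdir, "util.py")
    with open(fake_util, "w", encoding="utf-8") as handle:
        handle.write("def open_stream(filename, mode='r'):\n")
        handle.write("    # toy stand-in, not dlcpar's real code\n")
        handle.write("    stream = open(filename, mode)\n")
        handle.write("    return stream\n")
    found = find_statement_lines(fake_util, "stream = open(filename, mode)")

print(found)
```

Pointing the helper at the installed util.py and getting 1170 rather than 1209 would suggest that the file belongs to 0.9.7.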

sujaikumar commented 7 years ago

Thanks very much for the detective work. Yes, that's exactly what the problem was. For future reference, if anyone wants to uninstall dlcpar (or another Python package) COMPLETELY, this is what worked for me.

I had originally uninstalled dlcpar using

pip uninstall dlcpar

but got a message saying

DEPRECATION: Uninstalling a distutils installed project (dlcpar) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.

which I had ignored.

This time, I removed everything by first re-running the install with --record, which writes the list of installed files to files.txt:

python setup.py install --record files.txt

and then deleting all the files/directories listed in files.txt using:

parallel rm -rf :::: files.txt # gnu parallel is wonderful for things like this
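
For anyone without GNU parallel, the deletion step can be sketched in Python. This assumes files.txt holds one path per line, which is how `--record` writes it; `remove_recorded` is a hypothetical helper, and the demo runs against throwaway files rather than a real install.

```python
import os
import shutil
import tempfile


def remove_recorded(record_file):
    """Delete every path listed (one per line) in `record_file`; return those removed."""
    removed = []
    with open(record_file, encoding="utf-8") as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            elif os.path.lexists(path):
                os.remove(path)
            else:
                continue  # already gone
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as workdir:
    # Throwaway stand-ins for installed files.
    targets = [os.path.join(workdir, name) for name in ("a.py", "b.py")]
    for target in targets:
        with open(target, "w", encoding="utf-8") as handle:
            handle.write("# placeholder\n")
    record = os.path.join(workdir, "files.txt")
    with open(record, "w", encoding="utf-8") as handle:
        handle.write("\n".join(targets) + "\n")

    removed = remove_recorded(record)
    leftovers = [p for p in targets if os.path.exists(p)]

print(len(removed), leftovers)
```

As with `rm -rf`, it is worth reading files.txt before deleting anything listed in it.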

Then I reinstalled it:

wget http://compbio.mit.edu/dlcpar/pub/sw/dlcpar-0.9.1.tar.gz
tar xzf dlcpar-0.9.1.tar.gz
cd dlcpar-0.9.1 # assuming the archive unpacks into a directory of this name
python setup.py install

And OrthoFinder worked perfectly, generating all the gene trees!

Thanks again!