hoffmangroup / genomedata

The Genomedata format for storing large-scale functional genomics data.
https://genomedata.hoffmanlab.org/
GNU General Public License v2.0

Bad error message when genomedata-load-data not found on PATH #49

Open EricR86 opened 5 years ago

EricR86 commented 5 years ago

Original report (archived issue) by Coby Viner (Bitbucket: cviner2, GitHub: cviner).


In some environments, pip install --user genomedata succeeds even though $HOME/.local/bin is not on the user's PATH. The user is responsible for adding it, often via their .bashrc, but when it is missing the result is the cryptic error below. The error occurs when genomedata.load_genomedata.load_genomedata is invoked while the genomedata-load-data binary is not on the PATH. In version 1.3.6 on Graham:

>> <cytomod.py> 2019-01-02T15:42:09.236766 Using the following
                                           ambiguity map: {}.
>> <cytomod.py> 2019-01-02T15:42:10.203963 Creating genomedata archive
                                           in ../data//archive/.
>> Using temporary Genomedata archive: /tmp/genomedata.0w5maf
>> 2019-01-02T15:42:10.208567: Loading sequence files:
../data/mm9_chrY.fa.gz
>> 2019-01-02T15:44:34.820816: Opening Genomedata archive with 6 tracks
>> 2019-01-02T15:44:34.826996: Loading data
>> Loading data for track: mm9_chrY-only_5mC-fakeData.bedGraph.gz
zcat ../data/mm9_chrY-only_5mC-fakeData.bedGraph.gz | genomedata-load-data -v /tmp/genomedata.0w5maf mm9_chrY-only_5mC-fakeData.bedGraph.gz
Error creating genomedata.
>> Cleaning up... done
Traceback (most recent call last):
  File "../../src/cytomod.py", line 714, in <module>
    seqfilenames=FASTA_file_list, verbose=args.verbose)
  File "/home/cviner2/.local/lib/python2.7/site-packages/genomedata/load_genomedata.py", line 125, in load_genomedata
    verbose=verbose)
  File "/home/cviner2/.local/lib/python2.7/site-packages/genomedata/_load_data.py", line 79, in load_data
    loader = Popen(load_cmd, stdin=reader.stdout)
  File "/cvmfs/soft.computecanada.ca/nix/store/4x0hqnpd0hfh62m3apkxmz8hz3hlsikx-python-2.7.13-env/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/cvmfs/soft.computecanada.ca/nix/store/4x0hqnpd0hfh62m3apkxmz8hz3hlsikx-python-2.7.13-env/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Note that the only error the user sees is an OSError that names no file, rather than an informative indication that genomedata-load-data cannot be found.
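One way the message could be made informative is to catch the error at the point where the subprocess is started and re-raise it with the program name attached. The following is only a minimal sketch: popen_or_explain is a hypothetical helper, not part of genomedata, and it assumes the command is passed as a list, as the Popen(load_cmd, ...) call in the traceback suggests.

```python
import errno
import subprocess


def popen_or_explain(cmd, **kwargs):
    """Start cmd with Popen, naming the program if it cannot be found.

    Hypothetical helper for illustration; not part of genomedata.
    """
    try:
        return subprocess.Popen(cmd, **kwargs)
    except OSError as err:
        # ENOENT usually means the executable is missing, although a
        # nonexistent working directory (cwd=...) can also trigger it.
        if err.errno == errno.ENOENT:
            raise OSError(
                errno.ENOENT,
                "could not run %r: executable not found on PATH" % cmd[0])
        raise
```

With such a wrapper, the final line of the traceback would mention genomedata-load-data by name instead of a bare "No such file or directory".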

The same occurs on the most recent version, 1.4.4, with one odd difference: the pipe between the zcat and genomedata-load-data invocations is printed twice in the echoed command. As displayed, that command would be a shell syntax error, but the exception actually raised is the same as before:

>> <cytomod.py> 2019-01-02T15:28:37.717452 Creating genomedata archive
                                           in ../data//archive/.
>> Using temporary Genomedata archive: /tmp/genomedata.S_x9Q7
>> 2019-01-02T15:28:37.721706: Loading sequence files:
../data/mm9_chrY.fa.gz
>> 2019-01-02T15:30:59.892858: Opening Genomedata archive with 6 tracks
>> 2019-01-02T15:30:59.899173: Loading data
>> Loading data for track: mm9_chrY-only_5mC-fakeData.bedGraph.gz
zcat ../data/mm9_chrY-only_5mC-fakeData.bedGraph.gz |  | genomedata-load-data -v /tmp/genomedata.S_x9Q7 mm9_chrY-only_5mC-fakeData.bedGraph.gz
Error creating genomedata.
>> Cleaning up... done
Traceback (most recent call last):
  File "../../src/cytomod.py", line 714, in <module>
    seqfilenames=FASTA_file_list, verbose=args.verbose)
  File "/home/cviner2/.local/lib/python2.7/site-packages/genomedata/load_genomedata.py", line 126, in load_genomedata
    maskfilename, verbose=verbose)
  File "/home/cviner2/.local/lib/python2.7/site-packages/genomedata/_load_data.py", line 115, in load_data
    loader = Popen(load_cmd, stdin=loader_input_process.stdout)
  File "/cvmfs/soft.computecanada.ca/nix/store/4x0hqnpd0hfh62m3apkxmz8hz3hlsikx-python-2.7.13-env/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/cvmfs/soft.computecanada.ca/nix/store/4x0hqnpd0hfh62m3apkxmz8hz3hlsikx-python-2.7.13-env/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

The problem appears to be limited to the error message itself, since both tests succeed on both versions once the PATH environment variable is corrected to include $HOME/.local/bin.

It might be best to add some test in the Python code, prior to invocation of any external binaries, that yields a clear error message if any needed external scripts or binaries are not on the path, as expected.
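A check along these lines could look roughly like the sketch below. It is an illustration under stated assumptions, not genomedata's actual code: the function names and the REQUIRED_PROGRAMS tuple are hypothetical, only genomedata-load-data is taken from the report above, and shutil.which requires Python 3.3 or later (on Python 2.7, distutils.spawn.find_executable plays a similar role).

```python
import shutil

# Assumption: the external programs needed; only genomedata-load-data
# is named in this report.
REQUIRED_PROGRAMS = ("genomedata-load-data",)


def find_missing_programs(programs):
    """Return the names in programs that cannot be found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


def check_required_programs(programs=REQUIRED_PROGRAMS):
    """Raise a descriptive error if any required program is missing."""
    missing = find_missing_programs(programs)
    if missing:
        raise RuntimeError(
            "required program(s) not found on PATH: %s. "
            "If genomedata was installed with 'pip install --user', "
            "make sure $HOME/.local/bin is on PATH."
            % ", ".join(missing))
```

Calling check_required_programs() at the start of load_genomedata, before any archive is created, would also avoid spending minutes loading sequence files only to fail at the first track.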

EricR86 commented 5 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Since this is rooted only in an admittedly poor error message, I've updated the title to reflect the issue.