hoffmangroup / genomedata

The Genomedata format for storing large-scale functional genomics data.
https://genomedata.hoffmanlab.org/
GNU General Public License v2.0
2 stars 1 forks source link

genomedata-load does not throw the correct error when given improper input #23

Open EricR86 opened 8 years ago

EricR86 commented 8 years ago

Original report (archived issue) by Rachel Chan (Bitbucket: rcwchan).


From pull request 15:

genomedata-load does not throw the correct error when given improper input. For example, when the input should be '-t TRACK=FILE.BED', and it is instead given '-t FILE.BED', it will throw the error:

#!python

Could not find track file: 

I suspect that error is not what is meant to happen, since it's located at:

#!python

100         if tracks is not None and len(tracks) > 0:
101             # Open hdf5 with track names
102             try:
103                 track_names = []
104                 for track_name, track_filename in tracks:
105                     if path(track_filename).isfile():
106                         if track_name not in track_names:  # No duplicates
107                             track_names.append(track_name)
108                     else:
109                         die("Could not find track file: %s" % track_filename)
110             except ValueError:
111                 die("Error saving data from tracks: %s" % tracks)

which assumes correct track_name, track_filenames. Instead, I think the ValueError here (earlier in the code) should be thrown instead:

#!python

    # Parse tracks into list of tuples
    try:
        tracks = []
        for track_expr in args.track:
            track_name, _, track_filename = track_expr.partition("=")
            tracks.append((track_name, track_filename))  # Tuple
    except ValueError:
        die(("Error parsing track expression: %s\Specify tracks"
             "in NAME=FILE form, such as: -t high=signal.high") % track_expr)

If "-t test.bed" is specified, track_name will be test.bed and track_filename will be "" (hence why the error message is blank).

With further investigation, even with "a=b=c", it passes ("a" becomes track_name, and "b=c" becomes track_filename), when I think one should expect a ValueError to be thrown.

I don't know how 'strict' we want to be with allowing/disallowing certain expressions (such as "a=b=c") or leave it up to the user to specify the correct expression. But if we want to implement checks, this can be solved with a check for the number of equals signs in the track_expr expression.

EricR86 commented 8 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).