hakyim / TO-DELETE-PrediXcan

Code for the in-dev PrediXcan Project
MIT License
28 stars 82 forks source link

Small Updates (error messages) #22

Closed hainswor closed 7 years ago

hainswor commented 7 years ago

1) Since dosage files are expected to be gzipped, added code to look for .gz suffix. Keeps non-dosage files with same prefix (i.e. maybe .log files by plink) from being read as input.

2) Added error message and system exit if user forgets to specify --predict and/or --assoc (elsewise code runs with no output or error).

3) With the availability of multiple tissues for imputation, we have found it useful to specify the output file (i.e. with tissue name) name when running --predict. If --pred_exp is not specified 'predicted_expression.txt' is the default and it is placed in --output_dir. If --pred_exp is specified with --predict, default behavior did not place in --output_dir.

4) Added sys.exit(1) if prediction file already exists.

hakyim commented 7 years ago

Thank you @hainswor for your contribution.

@ScottPDickinson could you please take care of this pull request?

ScottPDickinson commented 7 years ago

This all looks good, but for point 3, I wonder if it would be more useful to specify the output with a prefix instead of a directory. So you could do something like --predict --out_prefix output/liver and then the output file would be output/liver.predicted_expression.txt. That way you can have all of your tissue analyses in the same directory, but have the different file names. The --pred_exp option is there mainly for when you have to specify input for performing an association, and it could get confusing to specify an input file and then have the software add the output directory before.

hainswor commented 7 years ago

I agree, incorporating an --out_prefix option makes more sense, it would help avoid confusion if needing to specify a file for --assoc. That would be a helpful feature.

hainswor commented 7 years ago

Hi @ScottPDickinson, I added an --output_prefix option that will append the specified text with an underscore to the predicted expression file and and association file. See if this addresses the point you previously raised. Thanks so much!

ScottPDickinson commented 7 years ago

Thanks @hainswor! I'm merging onto develop for the moment until I can update some of the documentation and example files.

ScottPDickinson commented 7 years ago

Hey @hainswor, I just made some changes to the 'develop' branch. I removed the output_dir option all together, so user's will just have to include the directory as part of the output_prefix. I also changed some default behavior so the software won't stop if you already have a file with the output you specified. It will just overwrite it, a la unix philosophy. Would you mind checking it out?

hainswor commented 7 years ago

@ScottPDickinson sounds great! I'll look at this weekend or early next week. Thanks!

hainswor commented 7 years ago

@ScottPDickinson I made some minor changes to the develop branch. I think your updates made a lot of sense and will be helpful for analysis workflow. Thanks for your work on this!