rmkoesterer / uga

Universal Genome Analyst (uga) is an open, flexible, and efficient tool for the distribution, management, and visualization of whole genome data analyses.
GNU General Public License v3.0

duplicate IDs produce empty output files #9

Open rsherva opened 8 years ago

rsherva commented 8 years ago

If there are duplicate IDs in the phenotype file, the runs finish (with no valid output), but no error is detectable during compiling or meta. There is an error in the log file, however.

rmkoesterer commented 8 years ago

Hey,

Can you attach the log file with the error?

rsherva commented 8 years ago

/usr2/faculty/sherva/.conda/envs/uga_v2.0b5/lib/python2.7/site-packages/uga-2.0b5-py2.7-linux-x86_64.egg/uga/RunSnv.py:110: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 1394 but corresponding boolean dimension is 2652
  models_obj[n].get_snvs(cfg['buffer'])

rmkoesterer commented 8 years ago

For a longitudinal analysis, duplicate IIDs should be allowed, but --fid and --iid would be expected to match. It seems the error results from them not matching. I'll make sure that duplicated IIDs generate an appropriate error when --fid and --iid don't match.
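
A minimal sketch of the kind of check described above, not uga's actual implementation. It assumes a tab-delimited phenotype file and reads "--fid and --iid match" as "both options name the same column"; the function name `check_duplicate_iids`, the column names, and the sample data are hypothetical. It is written for Python 3, whereas the log above comes from a Python 2.7 environment.

```python
import csv
import io
from collections import Counter


def check_duplicate_iids(handle, fid_col, iid_col, sep="\t"):
    """Return the duplicated IIDs, or raise if duplicates are not allowed.

    Assumption: duplicates are acceptable only when fid_col and iid_col
    name the same column (the longitudinal case mentioned in the thread).
    """
    reader = csv.DictReader(handle, delimiter=sep)
    # Count how many rows carry each individual ID.
    counts = Counter(row[iid_col] for row in reader)
    dups = sorted(iid for iid, n in counts.items() if n > 1)
    if dups and fid_col != iid_col:
        raise ValueError(
            "duplicate IIDs found while --fid and --iid differ: "
            + ", ".join(dups)
        )
    return dups


# Hypothetical phenotype data: individual "a" appears twice.
pheno = "FID\tIID\tY\nf1\ta\t1.0\nf1\ta\t2.0\nf2\tb\t0.5\n"

# Same column for both options: duplicates are reported but allowed.
allowed = check_duplicate_iids(io.StringIO(pheno), "IID", "IID")

# Different columns: the duplicates become an explicit error.
try:
    check_duplicate_iids(io.StringIO(pheno), "FID", "IID")
    message = None
except ValueError as err:
    message = str(err)
```

Failing early like this would surface the problem before any model is fit, instead of leaving only a deprecation warning in the log and empty output files.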