NDAR / nda-tools

Python package for interacting with NDA web services. Used to validate, submit, and download data to and from NDA.
MIT License

incorrectly handles error messages without columnName field #13

Open yarikoptic opened 5 years ago

yarikoptic commented 5 years ago

gory details:

$ vtcmd /data/NDA/output_v2/image03.txt

Validating files...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00,  2.21s/it]

Error! Check file: /data/NDA/output_v2/image03.txt
> /home/XXX/proj/nda-tools/NDATools/Validation.py(138)output()
-> column = v['columnName']
(Pdb) up
> /home/XXX/proj/nda-tools/NDATools/clientscripts/vtcmd.py(170)validate_files()
-> validation.output()
(Pdb) l
165             if response['status'] == Status.SYSERROR:
166                 print('\nSystemError while validating: {}'.format(file))
167                 print('Please contact NDAHelp@mail.nih.gov')
168             elif response['errors'] != {}:
169                 print('\nError! Check file: {}'.format(file))
170  ->     validation.output()
171         print('Validation report output to: {}'.format(validation.log_file))
172     
173         if warnings:
174             validation.get_warnings()
175             print('Warnings output to: {}'.format(validation.log_file))
(Pdb) p file
'/data/NDA/output_v2/image03.txt'
(Pdb) down
> /home/XXX/proj/nda-tools/NDATools/Validation.py(138)output()
-> column = v['columnName']
(Pdb) l
133                              'MESSAGE':'None','RECORD': 'None'})
134                     else:
135                         for error, value in response['errors'].items():
136                             for v in value:
137                                 import pdb; pdb.set_trace()
138  ->                             column = v['columnName']
139                                 message = v['message']
140                                 try:
141                                     record = v['recordNumber']
142                                 except KeyError:
143                                     record = ' '
(Pdb) p response
{'status': 'CompleteWithErrors', 'errors': {'data_structure': [{'message': 'File not recognized as a data file, and was not associated or referenced from a data file.'}]}, 'scope': None, 'expiration_date': '2019-06-27T11:12:59.663-0400', 'done': True, 'id': '9fb117d6-9a51-4645-90a5-1db0fe2f4a9b', 'manifests': [], '_links': {'self': {'href': 'https://nda.nih.gov/api/validation/9fb117d6-9a51-4645-90a5-1db0fe2f4a9b'}}, 'warnings': {}, 'associated_file_paths': [], 'short_name': None}
(Pdb) p response['errors']
{'data_structure': [{'message': 'File not recognized as a data file, and was not associated or referenced from a data file.'}]}
(Pdb) p error
'data_structure'
(Pdb) p value
[{'message': 'File not recognized as a data file, and was not associated or referenced from a data file.'}]

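As you can see, the error entry contains only a 'message' key. There is no 'columnName', so the lookup on line 138 of Validation.py would fail with a KeyError.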
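For illustration, here is a minimal sketch of a more tolerant version of that loop. It assumes that any of 'columnName', 'message', or 'recordNumber' may be absent from an error entry. The helper name iter_error_rows and the ' ' placeholder are hypothetical and not part of NDATools; the sample data is the response['errors'] value printed above.

```python
def iter_error_rows(errors, placeholder=' '):
    """Yield (error, column, message, record) tuples from an 'errors' mapping.

    Hypothetical helper, not part of NDATools: missing keys fall back to
    `placeholder` instead of raising KeyError.
    """
    for error, entries in errors.items():
        for entry in entries:
            yield (
                error,
                entry.get('columnName', placeholder),
                entry.get('message', placeholder),
                entry.get('recordNumber', placeholder),
            )


# The errors mapping taken from the pdb session above.
errors = {
    'data_structure': [
        {'message': 'File not recognized as a data file, and was not '
                    'associated or referenced from a data file.'}
    ]
}
rows = list(iter_error_rows(errors))
```

Using dict.get for all three keys treats them uniformly, rather than guarding only 'recordNumber' with try/except as the current code does.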

obenshaindw commented 5 years ago

@yarikoptic this is a good find, and we should fix this.

We have also considered adding a check to NDATools to 'sniff' the delimiter (comma vs. tab) in the input file and set the Content-Type for the POST operation accordingly. The validation service actually supports text/csv, text/tsv, application/json, and application/xml for validation, and the media type is negotiated through HTTP headers. NDATools assumes you will use csv, but it could pretty easily support tsv.
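A rough sketch of what such a check might look like, using csv.Sniffer from the Python standard library. The function name guess_content_type, the delimiter-to-media-type mapping, and the fallback to text/csv are assumptions made for illustration; this is not how NDATools currently behaves.

```python
import csv

# Media types as named in this comment; the mapping itself is an assumption.
CONTENT_TYPES = {',': 'text/csv', '\t': 'text/tsv'}


def guess_content_type(sample, default='text/csv'):
    """Guess a Content-Type from a text sample of the input file.

    Hypothetical helper: only comma and tab are considered, and `default`
    is returned when the delimiter cannot be determined.
    """
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=',\t')
    except csv.Error:
        # Sniffer raises csv.Error when no consistent delimiter is found.
        return default
    return CONTENT_TYPES.get(dialect.delimiter, default)


tsv_sample = 'subjectkey\tsrc_subject_id\timage_file\nA\tB\tC\n'
csv_sample = 'subjectkey,src_subject_id,image_file\nA,B,C\n'
tsv_type = guess_content_type(tsv_sample)
csv_type = guess_content_type(csv_sample)
```

In practice the sample would be the first few lines read from the file being validated, and the result would be passed as the Content-Type header of the POST request.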

yarikoptic commented 5 years ago

Great, thanks for the explanation!