pepkit / eido

Validator for PEP objects
http://eido.databio.org
BSD 2-Clause "Simplified" License

Uniform eido validation behavior for inputs #35

Status: Closed (stolarczyk closed 1 year ago)

stolarczyk commented 2 years ago

The validate_inputs function behaves differently from the other validation functions and has more responsibilities, which was dictated by our use case in looper. Instead of raising an exception, it records which input files are missing and calculates the total input file size. Here's an example:

validate_inputs(sample=p.samples[0], schema="schema.yaml")

1 input files missing, job input size was not calculated accurately

Out[5]: 
{'missing': ['/Users/mstolarczyk/Desktop/testing/eido/file11A.txt'],
 'required_inputs': {'/Users/mstolarczyk/Desktop/testing/eido/file11A.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file11B.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file12A.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file12B.txt'},
 'all_inputs': {'/Users/mstolarczyk/Desktop/testing/eido/file11A.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file11B.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file12A.txt',
  '/Users/mstolarczyk/Desktop/testing/eido/file12B.txt'},
 'input_file_size': 0.0}

So, based on this output, it is the responsibility of the client software to decide what to do when one or more files are missing.
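
A minimal sketch of what that decision could look like on the client side. It only assumes the dictionary keys shown in the output above ('missing', 'required_inputs'); the helper names missing_required and decide are hypothetical and are not part of eido or looper.

```python
def missing_required(result):
    """Return the sorted missing paths that are also required inputs.

    `result` is assumed to have the shape of the dictionary shown above.
    """
    return sorted(set(result["missing"]) & set(result["required_inputs"]))


def decide(result):
    """Hypothetical client policy: skip the job if a required input is missing."""
    blocking = missing_required(result)
    if blocking:
        return "skip", blocking
    return "submit", []


# Usage with a trimmed-down version of the output above
example = {
    "missing": ["/Users/mstolarczyk/Desktop/testing/eido/file11A.txt"],
    "required_inputs": {
        "/Users/mstolarczyk/Desktop/testing/eido/file11A.txt",
        "/Users/mstolarczyk/Desktop/testing/eido/file11B.txt",
    },
}
action, blocking = decide(example)
```

Whether a missing required file should skip the job, fail the whole run, or just be logged is a policy choice left to the client here; this sketch picks "skip" only for illustration.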

Originally posted by @stolarczyk in https://github.com/pepkit/eido/issues/26#issuecomment-916345976

stolarczyk commented 2 years ago

I think the behavior of the validation functions should be consistent, i.e. validate_inputs should raise an exception if any required_files are missing and issue a warning if any files are missing. The file size calculation should be implemented elsewhere (looper).
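
A rough sketch of the proposed behavior, to make the exception/warning split concrete. This is not the eido implementation: check_inputs and MissingRequiredFilesError are hypothetical names, and the two arguments stand in for the paths resolved from the required_files and files sections of the schema.

```python
import os
import warnings


class MissingRequiredFilesError(Exception):
    """Hypothetical exception for missing required input files."""


def check_inputs(required_files, optional_files=()):
    """Raise if a required file is missing; warn if an optional file is missing.

    No file sizes are calculated here; under the proposal that is left
    to the client (looper).
    """
    missing_required = [f for f in required_files if not os.path.exists(f)]
    missing_optional = [f for f in optional_files if not os.path.exists(f)]

    # Warn first so optional files are reported even if an exception follows
    if missing_optional:
        warnings.warn(f"Optional input files missing: {missing_optional}")
    if missing_required:
        raise MissingRequiredFilesError(
            f"Required input files missing: {missing_required}"
        )
```

With this shape, validate_inputs would fail the same way the other validation functions do, and the client would catch the exception instead of inspecting a returned dictionary.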