Closed icaoberg closed 3 years ago
@mccalluc the error is the following:
$ src/validate_submission.py --dataset_ignore_globs=\*.tsv --local_directory "$D"
/hive/users/hive/icaoberg/ingest-validation/lib64/python3.6/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.3) or chardet (4.0.0) doesn't match a supported version!
  RequestsDependencyWarning)
Traceback (most recent call last):
  File "src/validate_submission.py", line 155, in <module>
    exit_status = main()
  File "src/validate_submission.py", line 147, in main
    errors = submission.get_errors()
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/submission.py", line 101, in get_errors
    tsv_errors = self._get_tsv_errors()
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/submission.py", line 154, in _get_tsv_errors
    self._get_single_tsv_external_errors(assay_type, path)
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/submission.py", line 186, in _get_single_tsv_external_errors
    assay_type, data_path)
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/submission.py", line 216, in _get_data_dir_errors
    assay_type, data_path, dataset_ignore_globs=self.dataset_ignore_globs)
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/validation_utils.py", line 30, in get_data_dir_errors
    schema = get_directory_schema(type)
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/schema_loader.py", line 26, in get_directory_schema
    schema = load_yaml(_directory_schemas_path / f'{directory_type}.yaml')
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/yaml_include_loader.py", line 19, in load_yaml
    expanded_text = _load_includes(path)
  File "/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/yaml_include_loader.py", line 24, in _load_includes
    text = path.read_text()
  File "/usr/lib64/python3.6/pathlib.py", line 1196, in read_text
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
  File "/usr/lib64/python3.6/pathlib.py", line 1183, in open
    opener=self._opener)
  File "/usr/lib64/python3.6/pathlib.py", line 1037, in _opener
    return self._accessor.open(self, flags, mode)
  File "/usr/lib64/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/hive/users/hive/icaoberg/ingest-validation/ingest-validation-tools/src/ingest_validation_tools/directory-schemas/None.yaml'
Maybe improve the error message? I mean, this is an edge case.
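For what it's worth, here is a rough sketch of what a friendlier failure could look like. The traceback shows the schema path being built as f'{directory_type}.yaml', so a type of None turns into None.yaml. Everything named below (load_directory_schema_text, UnknownDirectoryTypeError, the schemas_dir argument) is hypothetical and only for illustration; it is not the tool's actual code.

```python
from pathlib import Path


class UnknownDirectoryTypeError(Exception):
    """Hypothetical exception for a missing or unrecognized directory type."""


def load_directory_schema_text(directory_type, schemas_dir):
    """Return the raw YAML text of a directory schema, or raise a readable error.

    Assumption: schemas live as <directory_type>.yaml files in schemas_dir.
    """
    schemas_dir = Path(schemas_dir)
    available = sorted(p.stem for p in schemas_dir.glob('*.yaml'))

    # Catch the None case before it becomes a confusing "None.yaml" path.
    if directory_type is None:
        raise UnknownDirectoryTypeError(
            'No directory schema is associated with this assay type; '
            f'available schemas: {available}'
        )

    path = schemas_dir / f'{directory_type}.yaml'
    if not path.is_file():
        raise UnknownDirectoryTypeError(
            f'No directory schema named "{directory_type}"; '
            f'available schemas: {available}'
        )
    return path.read_text()
```

With a check like this, the user would see the list of known schemas instead of a FileNotFoundError from deep inside pathlib.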
Two separate issues:
scRNAseq-10xGenomics, and sc is a typo: Suggest following up with Chris on Slack, but in your court... If a code change is required, please file a separate issue.
.txt for GitHub to be happy.) There is a fixture (bad-no-such-type) that should cover this scenario, but obviously something is different about this case: Thanks for finding it.
Retitling.
@mccalluc do you have everything you need to address the stack trace? This is my current blocker; I can debug if that would be helpful.
Confirmed that it still exists at e16196b, which is master HEAD now.
Looking at it now...
After saving locally and retitling, I get the expected errors:
$ mkdir /tmp/fake-submission
$ mv ~/Downloads/UFLA_10x_SP-LY_Metadata_120420.tsv\ -\ UFTMC_10x_120420.tsv.tsv /tmp/fake-submission/ufla-10x-metadata.tsv
$ src/validate_submission.py --local_directory /tmp/fake-submission
Metadata TSV Errors:
/tmp/fake-submission/ufla-10x-metadata.tsv (as scrnaseq):
External:
? row 2, referencing /tmp/fake-submission/https:/app.globus.org/file-manager?origin_id=24c2ee95-146d-4513-a1b3-ac0bfdb7856f&origin_path=%2Fprotected%2FUniversity%20of%20Florida%20TMC%2F638799c2725a0c88ec7ee389cb98884f%2F
: No such file or directory: /tmp/fake-submission/https:/app.globus.org/file-manager?origin_id=24c2ee95-146d-4513-a1b3-ac0bfdb7856f&origin_path=%2Fprotected%2FUniversity%20of%20Florida%20TMC%2F638799c2725a0c88ec7ee389cb98884f%2F
....
Rerunning with --dataset_ignore_globs=\*.tsv changes nothing.
If I rename to the original name, I get a different, shorter error:
$ mv /tmp/fake-submission/ufla-10x-metadata.tsv /tmp/fake-submission/UFLA_10x_SP-LY_Metadata_120420.tsv\ -\ UFTMC_10x_120420.tsv
$ src/validate_submission.py --local_directory /tmp/fake-submission
Metadata TSV Errors:
Missing: There are no effective TSVs.
Reference Errors:
No References:
- UFLA_10x_SP-LY_Metadata_120420.tsv - UFTMC_10x_120420.tsv
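A guess at why the original name gives "no effective TSVs", sketched as code. The assumption here (not confirmed anywhere in this thread) is that only top-level files matching *-metadata.tsv are treated as metadata TSVs. The function name find_effective_tsvs and the default pattern are hypothetical.

```python
from pathlib import Path


def find_effective_tsvs(directory, pattern='*-metadata.tsv'):
    """Split top-level .tsv files into those matching the pattern and the rest.

    Assumption: the default pattern is a guess at the naming convention,
    not something taken from the validator's source.
    """
    directory = Path(directory)
    effective = sorted(p.name for p in directory.glob(pattern))
    ignored = sorted(
        p.name for p in directory.glob('*.tsv') if p.name not in effective
    )
    return effective, ignored
```

Under that assumption, ufla-10x-metadata.tsv would be picked up, while UFLA_10x_SP-LY_Metadata_120420.tsv - UFTMC_10x_120420.tsv would land in the ignored list, which matches the two runs above.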
Library versions have recently been upgraded, so do a fresh pip install.
If you still have problems, can you provide me the output of python --version and pip freeze and find $D (or whatever the submission directory is), and the operating system you're on?
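If it is easier than running those commands one by one, something along these lines could gather the same information in a single pass. This is only a sketch: collect_diagnostics and its include_pip flag are made-up names, and it sticks to the standard library so it should also run on Python 3.6.

```python
import os
import platform
import subprocess
import sys


def collect_diagnostics(submission_dir, include_pip=True):
    """Gather roughly what `python --version`, `pip freeze`, and `find` report."""
    report = {
        'python_version': platform.python_version(),
        'operating_system': platform.platform(),
        'files': [str(submission_dir)],  # like find, list the root itself first
        'pip_freeze': None,
    }

    # Walk the submission directory, similar to `find $D`.
    for root, dirs, files in os.walk(str(submission_dir)):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(dirs + files):
            report['files'].append(os.path.join(root, name))

    if include_pip:
        # Run pip through the current interpreter so the right environment is used.
        result = subprocess.run(
            [sys.executable, '-m', 'pip', 'freeze'],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report['pip_freeze'] = result.stdout
    return report
```

Printing the returned dictionary (for example with pprint) and pasting it here would cover everything asked for above.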
In the file level-1.yaml the assay type is defined as scRNAseq-10xGenomics. However, in the test dataset from the data provider, it is defined as snRNAseq-10Xgenomics in the metadata.tsv file.
@pdblood and @jswelling which one is correct?
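Whichever spelling turns out to be right, a near-miss like this could be flagged with a suggestion instead of failing later. Below is a minimal sketch using difflib from the standard library; suggest_assay_type and the known_types list passed to it are hypothetical, and only the scRNAseq-10xGenomics name comes from this thread.

```python
import difflib


def suggest_assay_type(value, known_types):
    """Return (is_exact_match, suggestions) for an assay type read from a TSV.

    Comparison for suggestions is case-insensitive, so differences such as
    10X vs 10x do not hide an otherwise close match.
    """
    if value in known_types:
        return True, []

    # Map lowercased names back to their canonical spelling.
    lowered = {t.lower(): t for t in known_types}
    close = difflib.get_close_matches(
        value.lower(), list(lowered), n=3, cutoff=0.8
    )
    return False, [lowered[name] for name in close]
```

For example, suggest_assay_type('snRNAseq-10Xgenomics', ['scRNAseq-10xGenomics']) reports no exact match and offers scRNAseq-10xGenomics as the closest known name, which could then go into the error message.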