Closed icaoberg closed 3 years ago
Upload directories should have this structure:
In the canonical form, there should only be TSVs and data directories at the top level. (As part of ingest, the metadata TSV is broken up into single lines, and each single line is put in the corresponding dataset, and the validation is invoked with a different set of flags... but I think you want the documented, canonical form.)
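As a rough illustration of the top-level rule described above, the following sketch lists anything at the top level of an upload that is neither a TSV nor a directory. It is not how `validate_upload.py` works internally; the function name `noncanonical_top_level` is hypothetical, and the check assumes TSVs can be recognized by a `.tsv` suffix.

```python
from pathlib import Path


def noncanonical_top_level(upload_dir):
    """Return sorted names of top-level entries that are neither TSVs nor directories.

    Hypothetical helper for illustration only; it does not check whether a
    directory is actually referenced by a TSV.
    """
    offenders = []
    for entry in Path(upload_dir).iterdir():
        # Directories and *.tsv files are allowed at the top level.
        if entry.is_dir() or entry.suffix == ".tsv":
            continue
        offenders.append(entry.name)
    return sorted(offenders)
```

For the upload discussed here, a check like this would flag `validation_report.txt` but not `extras`, since `extras` is a directory; whether a directory counts as a data directory depends on the TSV references.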
To fix:
- Remove `extras`, or move it under `Proteomics/`.
- `metadata.tsv` should have one data line, and its `data_path` should be `Proteomics`.
- Remove `validation_report.txt`, or move it under `Proteomics/extras/`.
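To check the fixes above before rerunning the validator, one could compare the `data_path` values in the top-level TSVs against the top-level directories. This is only a sketch under assumptions: the function name `check_data_path_references` is hypothetical, it assumes each TSV has a `data_path` column whose value names a top-level directory, and it is not the validator's own logic.

```python
import csv
from pathlib import Path


def check_data_path_references(upload_dir):
    """Compare data_path values in top-level TSVs with top-level directories.

    Returns (unreferenced, missing):
      unreferenced: directories that no TSV row points to
      missing: data_path values with no matching directory
    Hypothetical helper; the "data_path" column name is taken from the thread.
    """
    upload = Path(upload_dir)
    referenced = set()
    for tsv in sorted(upload.glob("*.tsv")):
        with tsv.open(newline="") as handle:
            for row in csv.DictReader(handle, delimiter="\t"):
                value = (row.get("data_path") or "").strip()
                parts = Path(value).parts if value else ()
                if parts:
                    # Only the first path component is a top-level directory.
                    referenced.add(parts[0])
    directories = {p.name for p in upload.iterdir() if p.is_dir()}
    unreferenced = sorted(directories - referenced)
    missing = sorted(referenced - directories)
    return unreferenced, missing
```

An unreferenced `Proteomics` in the first list would correspond to the situation the validator reports as "There are no references from any TSV to Proteomics."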
@mccalluc so the issue still remains even after your suggestions.
python3 src/validate_upload.py --local_directory 7f1fd7b9c8c3745fcab037a2fa37f5b9/ --dataset_ignore_globs extras --dataset_ignore_globs '*metadata.tsv' --dataset_ignore_globs validation_report.txt
/hive/users/hive/ingest-validation-tools/lib/python3.7/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.25.11) or chardet (4.0.0) doesn't match a supported version!
RequestsDependencyWarning)
There are no references from any TSV to Proteomics.
Hint: If validation fails because of extra whitespace in the TSV, try:
src/cleanup_whitespace.py --tsv_in original.tsv --tsv_out clean.tsv.
I don't know whether there is an issue with the metadata, but it seems unlikely. This file was created by the data provider with @cebriggs7135.
I would prefer that ongoing conversations be moved to Slack: I will respond to things more quickly there, and it's a better place for things where open and closed may be fuzzy.
Please run `tree` again. What is `data_path` in the metadata TSV?

@mccalluc @icaoberg Moved discussion to Slack, per Chuck's request.
See Slack. Please do not reopen.
I am trying to validate an `LC-MS Top-Down` submission from Northwestern (for reference, @jswelling has not ingested this dataset, and @cebriggs7135 has not validated it using Airflow).

When I use the command, I get this message from the tool, and I don't know how to interpret it. I ran `cleanup_whitespace.py` as suggested, just in case, and I get the same error.

The directory structure is