Closed reynoldtan closed 10 months ago
The fix for this issue is very specific to PDF files. Instead, it should check the file content for a delimited structure (tab- or comma-separated) and infer the validity of the file from that. I will revise the rule for this one.
We already discussed this at our latest meeting, but to summarize it here: we think this approach of checking for a disguised pdf is a bit too specific. A broader approach would be to validate that the file is tab-delimited by parsing the first line. Ideally, we would go so far as to implement a method in the Tripal Importer class that can do checks for tab-delimited text and the number of columns, since this functionality is needed by most (maybe all) importers written for Tripal.
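A minimal sketch of the first-line check described above, written in Python purely for illustration. It is not Tripal's API (Tripal itself is PHP), and the names `check_header_line` and `check_file` and the `expected_columns` parameter are hypothetical. It assumes the importer knows how many columns it expects and that valid files are UTF-8 text.

```python
def check_header_line(raw, expected_columns, delimiter="\t"):
    """Validate the raw bytes of a file's first line.

    Returns (ok, message). Hypothetical helper, not part of Tripal.
    """
    if not raw.strip():
        return False, "file is empty or the first line is blank"
    try:
        line = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Binary content that cannot be read as text at all.
        return False, "first line is not valid UTF-8 text"
    if "\x00" in line:
        return False, "first line contains NUL bytes (binary content)"
    # Split on the delimiter and compare against the expected column count.
    columns = line.rstrip("\r\n").split(delimiter)
    if len(columns) != expected_columns:
        return False, "expected %d columns, found %d" % (
            expected_columns, len(columns))
    return True, "ok"


def check_file(path, expected_columns, delimiter="\t"):
    """Read only the first line of the file and validate it."""
    with open(path, "rb") as handle:
        return check_header_line(handle.readline(), expected_columns, delimiter)
```

With this kind of check, a renamed PDF is rejected without any PDF-specific rule: its first line is a `%PDF-` header with no tabs, so it splits into a single column and fails the column count. The same function covers comma-delimited files by passing `delimiter=","`.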
**Issue #59 - PDF as tsv file**
Motivation
The data file fails to trigger validation when the uploaded file is a PDF that has been renamed to look like a tsv file.
What does this PR do?
Please describe each thing this PR does. For example, a PR may 1) solve a specific bug, 2) create an automated test to ensure it doesn't return.
Testing
Below is a tsv file made from a pdf file by replacing the pdf extension with tsv. sample.txt
Upload this file in the importer's file field and validate; this should trigger an incorrect file format/extension error.