Open ThrawnCA opened 3 years ago
The latest 'develop' branch on qld-gov-au/ckan-ex-qgov includes logic to allow archives to specify any resource format, so long as the archive extension and contents are well-formed and matching.
Is all of this work here now @ThrawnCA? https://github.com/qld-gov-au/ckanext-resource-type-validation
It'd be interesting to see if these 2 plugins would play nicely together... possibly we'd need an option to disable the blacklisting in ckanext-security if someone wanted to use ckanext-resource-type-validation alongside it.
Is all of this work here now @ThrawnCA? https://github.com/qld-gov-au/ckanext-resource-type-validation
Yeah, it was pretty much independent so we split it out. Still working on adding samples and tests for the various types we want to support, but the logic seems to work at this point.
It'd be interesting to see if these 2 plugins would play nicely together... possibly we'd need an option to disable the blacklisting in ckanext-security if someone wanted to use ckanext-resource-type-validation alongside it.
Offhand I can't think of a reason for them to clash. Ensuring that types are consistent should be compatible with applying a blacklist, and ckanext-resource-type-validation
has a pretty simple interaction with CKAN.
It should be simple enough to try it out, anyway. If you install the 'develop' branch, you should have the latest bug fixes.
@markstuart We've released ckanext-resource-type-validation version 1.0.0 now. Feedback is welcome!
CKAN currently does not check that uploaded files actually match the specified format, which opens the door to malicious uploads. For example, it's entirely possible to upload the EICAR test virus and tell CKAN it's a PDF document; anyone who downloads it will likely be offered the opportunity to run it (eg Firefox on Ubuntu offered to open it with the Mono runtime).
The ckan-security mailing list indicated that this is a won't-fix, because some CKAN sites may choose to let people specify formats that don't match, eg a ZIP archive containing Office documents may be legitimately marked as DOC format. There may still be options there, however, such as checking at least that when a ZIP extension is used, content type sniffing agrees that it's a ZIP file, and vice versa. Or, ideally, checking that it actually contains DOC files.
We've developed an implementation within https://github.com/qld-gov-au/ckan-ex-qgov that doesn't have any special handling of archive types, but does implement IResourceController to check that the file extension, format, and sniffed file type are all compatible.