ckan / ckanext-qa

CKAN QA Extension
MIT License

QA rating without downloading files #76

Open jze opened 1 year ago

jze commented 1 year ago

I would like to suggest that the extension be fundamentally redesigned so that it is not necessary to download complete files.

Downloading complete files is not ideal. Some files are several gigabytes in size. For other resources, generating the file requires a lot of computing time on the source system.

Therefore, I suggest different levels to determine the file format:

  1. Trust the format specification made by the user.
  2. Do an HTTP HEAD request and trust the webserver's answer.
  3. Download a few bytes of the resource and use file magic numbers.
  4. Download the complete file and do the analysis.
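
The levels above could be sketched roughly as follows. This is only a minimal illustration, not code from ckanext-qa: the function names, the `MAGIC_NUMBERS` and `CONTENT_TYPES` tables, and the choice to try the levels from cheapest to most expensive are all assumptions made for the example.

```python
import urllib.request

# Hypothetical lookup tables for illustration; a real implementation
# would need far more entries (or a library such as libmagic).
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\x1f\x8b", "GZIP"),
]
CONTENT_TYPES = {
    "text/csv": "CSV",
    "application/json": "JSON",
    "application/pdf": "PDF",
    "application/zip": "ZIP",
}


def format_from_content_type(content_type):
    """Map a Content-Type header value to a format name, or None."""
    if not content_type:
        return None
    mime = content_type.split(";", 1)[0].strip().lower()
    return CONTENT_TYPES.get(mime)


def format_from_magic(head_bytes):
    """Match the first bytes of a file against known signatures."""
    for signature, fmt in MAGIC_NUMBERS:
        if head_bytes.startswith(signature):
            return fmt
    return None


def head_content_type(url, timeout=10):
    """Level 2: HEAD request, return the Content-Type header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")


def fetch_first_bytes(url, count=16, timeout=10):
    """Level 3: ask for the first bytes only via a Range header."""
    headers = {"Range": "bytes=0-%d" % (count - 1)}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # read(count) caps the download even if the server ignores Range.
        return response.read(count)


def detect_format(url, declared_format=None,
                  head=head_content_type, fetch=fetch_first_bytes):
    """Try the cheap levels in order; return (format, level_used)."""
    if declared_format:                      # Level 1: trust the user
        return declared_format.upper(), "declared"
    fmt = format_from_content_type(head(url))
    if fmt:                                  # Level 2: trust the server
        return fmt, "head"
    fmt = format_from_magic(fetch(url))
    if fmt:                                  # Level 3: magic numbers
        return fmt, "magic"
    return None, "full-download-needed"      # Level 4 left to the caller
```

The `head` and `fetch` parameters are only there so the network calls can be swapped out, for example in tests. Level 4 is deliberately not implemented here; the sketch just reports that the cheaper levels were inconclusive.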

Zharktas commented 1 year ago

You are welcome to contribute. This extension is on quite minimal maintenance, so no major refactoring will happen any time soon.