As I was reviewing #68 I noticed that, overall:

- The order of validation is currently the same for all filetypes
- Not all filetypes need the same validation
- For some filetypes the validation is currently redundant or not in the optimal order

For `:omnivore` types in particular, the problems are:

- The `digest` function in `mapnik-omnivore` is called 3 times on one file
- A `tilelive-bridge` source is created twice for one file
- Even though the `digest` function in `mapnik-omnivore` calls `sniffer.quaff` internally, we still call `sniffer.quaff` first before calling `digest` here

If any one of these functions is expensive (and they are for big files), then the cost compounds.

This PR starts refactoring the code so that validation no longer tries to be generic across all filetypes. Instead we call into a filetype-specific validator, and inside each one we can optimize the order of checks and make only the necessary calls. In particular, this fixes the case that breaks in #68: the existing KML layer count validation can now run before `tilelive.info`, which would never finish because it is so expensive.
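
To illustrate the idea, here is a minimal sketch of per-filetype dispatch with cheap checks ordered before expensive ones. It is not the code in this PR: every name in it (`validators`, `validate`, `checkKmlLayerCount`, `checkInfo`, `MAX_LAYERS`, and the shape of the `file` object) is hypothetical, and it is synchronous for brevity, whereas the real validators are presumably callback-based.

```javascript
// Hypothetical sketch; none of these names come from the actual codebase.
const MAX_LAYERS = 15; // assumed limit, for illustration only
const calls = [];      // records which checks ran, to make the ordering visible

// Cheap check: returns an error message, or null if the file passes.
function checkKmlLayerCount(file) {
  calls.push('kmlLayerCount');
  return file.layers > MAX_LAYERS ? 'too many layers' : null;
}

// Stand-in for an expensive step such as tilelive.info.
function checkInfo(file) {
  calls.push('info');
  return null;
}

// Each filetype owns its ordered list of checks, cheapest first.
const validators = {
  kml: [checkKmlLayerCount, checkInfo],
  geojson: [checkInfo]
};

function validate(file) {
  const checks = validators[file.type];
  if (!checks) return 'unsupported filetype: ' + file.type;
  for (const check of checks) {
    const err = check(file);
    if (err) return err; // stop before any later, more expensive check
  }
  return null;
}
```

With this layout, `validate({ type: 'kml', layers: 100 })` returns after the layer count check alone, so the expensive step never runs for a file that is going to be rejected anyway.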

/cc @mapsam @who8mycakes