not urgent, but I'm starting to track some wishlist stats that we can keep in our repo and use in presentations etc. for the page classifier:
how many documents we were able to classify using filenames only
how many additional documents of each type we are able to retrieve by using thumbnail classification instead of filenames
for documents that could not be classified using filenames only, take 10 instances of each document type (as labeled by the thumbnail classifier) and calculate accuracy.
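
A rough sketch of how the three numbers above could be computed, in case it helps. Everything here is an assumption rather than our actual code: the record layout (`id`, `filename_label`, `thumbnail_label`, with `None` meaning "not classified"), the function names, and the idea that the sampled documents get manually labeled and passed in as `true_labels`. Note that accuracy measured per predicted type is really per-type precision.

```python
import random
from collections import Counter, defaultdict


def coverage_stats(docs):
    """Count filename-classified docs and the extra docs per type from thumbnails.

    `docs` is a list of dicts with hypothetical keys 'filename_label' and
    'thumbnail_label'; None means that classifier produced no label.
    """
    by_filename = sum(1 for d in docs if d["filename_label"] is not None)
    additional = Counter(
        d["thumbnail_label"]
        for d in docs
        if d["filename_label"] is None and d["thumbnail_label"] is not None
    )
    return by_filename, additional


def sample_for_review(docs, per_type=10, seed=0):
    """Pick up to `per_type` filename-unclassified docs for each thumbnail label."""
    rng = random.Random(seed)  # fixed seed so the review sample is reproducible
    groups = defaultdict(list)
    for d in docs:
        if d["filename_label"] is None and d["thumbnail_label"] is not None:
            groups[d["thumbnail_label"]].append(d)
    return {
        label: rng.sample(items, min(per_type, len(items)))
        for label, items in groups.items()
    }


def accuracy_per_type(samples, true_labels):
    """Fraction of sampled docs whose manual label matches the thumbnail label.

    `true_labels` maps a doc id to its manually assigned label (assumed input).
    """
    return {
        label: sum(1 for d in items if true_labels[d["id"]] == label) / len(items)
        for label, items in samples.items()
    }
```

Usage would be something like: call `coverage_stats(docs)` for the first two numbers, hand the output of `sample_for_review(docs)` to whoever is labeling, then feed their labels into `accuracy_per_type`.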