Convnet labeler - Githubissues

mwharton3 commented 4 years ago

This in-progress branch will add functionality to label photo vs. document, orientation (to the nearest 90 degrees), and aspect ratio problems, along with methods to fix them.

mwharton3 commented 4 years ago

Some notable issues that need fixing:

Quantize the convnet for speed and portability (weights files of ~100MB or smaller would be ideal).
Processing is almost entirely serial at the moment; adding parallel maps wherever race conditions aren't a problem is a strong need.
Refactor the kaishi/core/image code. A couple of files are getting bloated.
Add an optimal batch size detection function (or find one elsewhere). Running predictions on a large data set will need this info.
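
For the batch size item, here is a minimal sketch of one possible approach, not anything that exists in kaishi yet. It assumes "optimal" can be approximated by the largest batch that runs without an out-of-memory error, and that fitting is monotonic (if a size fails, every larger size fails too). The names `find_max_batch_size`, `try_batch`, and `oom_errors` are hypothetical; `try_batch` stands in for whatever runs one prediction batch of a given size, and the exception types that signal an out-of-memory failure depend on the framework.

```python
from typing import Callable, Tuple, Type


def find_max_batch_size(
    try_batch: Callable[[int], None],
    start: int = 1,
    limit: int = 4096,
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError, RuntimeError),
) -> int:
    """Return the largest batch size in [start, limit] that try_batch accepts.

    Hypothetical helper: try_batch(n) should run one batch of size n and
    raise one of oom_errors if it does not fit. Returns 0 if even `start`
    fails. Assumes that if size n fails, every size above n fails too.
    """
    if start < 1 or limit < start:
        raise ValueError("need 1 <= start <= limit")

    def fits(size: int) -> bool:
        try:
            try_batch(size)
            return True
        except oom_errors:
            return False

    if not fits(start):
        return 0

    # Phase 1: double the batch size until it fails or reaches the limit.
    good, bad = start, None
    while good < limit:
        size = min(good * 2, limit)
        if fits(size):
            good = size
        else:
            bad = size
            break
    if bad is None:
        return good

    # Phase 2: binary search between the last good and first bad size.
    while bad - good > 1:
        mid = (good + bad) // 2
        if fits(mid):
            good = mid
        else:
            bad = mid
    return good


if __name__ == "__main__":
    # Stand-in for a real prediction call: pretend memory holds 37 images.
    def fake_predict(batch_size: int) -> None:
        if batch_size > 37:
            raise MemoryError("batch too large")

    print(find_max_batch_size(fake_predict))  # prints 37
```

The doubling phase keeps the number of trial runs logarithmic in the final batch size, which matters if each trial means a real forward pass over a batch of images.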

mwharton3 commented 4 years ago

@spencerR1992 @zzsi I'm going to go ahead and merge this since there are a ton of changes, but submit issues if you notice something problematic.

If you want to test what's been changed, make a small folder with documents and photos, add some random rotations, save, and then run the commands below:

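from kaishi.image import Dataset

# Load the test folder of documents and photos
imd = Dataset('path/to/images')
# Run the pipeline, then print an interim report
imd.run_pipeline()
imd.report()
# Run the convnet labeler and report again
imd.predict_and_label()
imd.report()
# Fix the rotations and save the output images
imd.transform_fix_rotation()
imd.save('path/to/output')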

There will be some interim reports/etc. that you can check out, along with the output images. It's definitely not very good yet, but it's a working model at least.

kungfuai / kaishi

Convnet labeler #3