Initial commit for splitting data and building a fish classifier

matthew-sochor-zz commented 7 years ago

Todo:

build a better model than a lame 80% accuracy
only run images through early steps once to generate features, then fit on features
try all of the different models and optimize 4 more fish and multinomial!?
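
The second todo item (run the images through the early steps once, then fit on the features) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `extract_features` is a hypothetical stand-in for the frozen early layers (here just a fixed random projection plus ReLU), the images and labels are random toy data, and `fit_softmax` is a plain NumPy multinomial logistic regression rather than whatever model the notebook actually uses.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen early layers of a pretrained network:
# a fixed random projection followed by a ReLU.
W_frozen = rng.standard_normal((64 * 64 * 3, 32)).astype(np.float32)

def extract_features(images):
    """Run the expensive, frozen part of the model (done once per image)."""
    flat = images.reshape(len(images), -1)
    return np.maximum(flat @ W_frozen, 0.0)

def fit_softmax(X, y, n_classes, lr=0.1, epochs=200):
    """Fit a multinomial logistic regression on precomputed features."""
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-6)
    W = np.zeros((X.shape[1], n_classes), dtype=np.float32)
    onehot = np.eye(n_classes, dtype=np.float32)[y]
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        W -= lr * (X.T @ (probs - onehot)) / len(X)
    return W

# Toy data standing in for the fish images: 40 images, 2 classes.
images = rng.random((40, 64, 64, 3), dtype=np.float32)
labels = rng.integers(0, 2, size=40)

# Compute the features once and cache them on disk; later runs only load them.
cache_path = os.path.join(tempfile.mkdtemp(), "features.npy")
if not os.path.exists(cache_path):
    np.save(cache_path, extract_features(images))
features = np.load(cache_path)

# Only the cheap classifier is refit when trying different models.
W = fit_softmax(features, labels, n_classes=2)
```

The point of the split is that trying a different classifier only repeats the last line, not the feature extraction.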

thenomemac commented 7 years ago

I'll be working on the web app tonight. Ping me with issues if you want any general advice on deep learning choices, but I figure you guys want to build it yourselves a bit for the knowledge.

On Apr 2, 2017 11:10 AM, "Matthew A Sochor" notifications@github.com wrote:

You can view, comment on, or merge this pull request online at:

https://github.com/matthew-sochor/fish.io.ai/pull/6

Commit Summary

Initial commit for splitting data and building a fish classifier

File Changes

A modeling/Makefile https://github.com/matthew-sochor/fish.io.ai/pull/6/files#diff-0 (26)

A modeling/modeling_scratch.ipynb https://github.com/matthew-sochor/fish.io.ai/pull/6/files#diff-1 (1955)

A modeling/split.py https://github.com/matthew-sochor/fish.io.ai/pull/6/files#diff-2 (23)

Patch Links:

https://github.com/matthew-sochor/fish.io.ai/pull/6.patch

https://github.com/matthew-sochor/fish.io.ai/pull/6.diff

thenomemac commented 7 years ago

Here is a quick, voice-dictated list of things I might try to improve, based on my experience with deep learning.

First, I would essentially assume that at each stage of the pipeline the file system is your database. In other words, write things as Python scripts that read either from a directory listing that comes in as the input or from a list of files, and use generators where possible to iterate over these lists of files.

Next, I would read in the images with the matplotlib image reading function or something similar. These will convert the JPEGs to an array in memory; then make sure you get the ordering of the channels right. Then use bcolz at each step of the pipeline to spill every image to disk as a memory-mapped array.

Then, for every image, I would create something like 20 distortions and dump those to disk. Then I would iterate over the folder of all those distortions to dump the outputs of the convolutional layers of the model. You probably want to dump several different layers at different points in the model to see which one works best.

Then use Keras to take those arrays of convolutional features as input and build a classifier on top of them, whether that be logistic regression or a simple two-layer model. More advanced, but also common, is to build another convolutional network on top of those features.
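
As a rough illustration of the "file system is your database" idea with generators and on-disk distortions, here is a minimal sketch. It is not the project's code: `iter_files`, `distort`, and `run_stage` are hypothetical names, the distortions are just random flips and shifts, the input "images" are random arrays, and plain `np.save` is swapped in for bcolz to keep the example dependency-free.

```python
import os
import tempfile
import numpy as np

def iter_files(root, suffix):
    """Lazily yield paths under `root` ending with `suffix`, in sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort makes the traversal order deterministic
        for name in sorted(filenames):
            if name.endswith(suffix):
                yield os.path.join(dirpath, name)

def distort(image, rng, n=20):
    """Yield `n` simple distortions (random horizontal flip and shift)."""
    for _ in range(n):
        out = image[:, ::-1] if rng.random() < 0.5 else image
        dy, dx = rng.integers(-4, 5, size=2)
        yield np.roll(out, shift=(dy, dx), axis=(0, 1))

def run_stage(in_dir, out_dir, seed=0):
    """One pipeline stage: read arrays from `in_dir`, write distortions to `out_dir`."""
    rng = np.random.default_rng(seed)
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for path in iter_files(in_dir, ".npy"):
        image = np.load(path)
        stem = os.path.splitext(os.path.basename(path))[0]
        for i, d in enumerate(distort(image, rng)):
            out_path = os.path.join(out_dir, f"{stem}_{i:02d}.npy")
            np.save(out_path, d.astype(np.float32))
            count += 1
    return count

# Toy input: three random "images" written by a pretend earlier stage.
root = tempfile.mkdtemp()
in_dir = os.path.join(root, "images")
out_dir = os.path.join(root, "distorted")
os.makedirs(in_dir)
for i in range(3):
    toy = np.random.default_rng(i).random((32, 32, 3))
    np.save(os.path.join(in_dir, f"fish_{i}.npy"), toy)

n_written = run_stage(in_dir, out_dir)
```

Because each stage only reads one directory and writes another, a later stage (for example, dumping convolutional features) can reuse `iter_files` on `out_dir` without holding all images in memory.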

thenomemac commented 7 years ago

Another tip: make sure you swap the channel, x, and y dimensions to match what TensorFlow expects. And always set dtype in NumPy to float32 to cut read time, storage, and compute time in half. No one uses float64 in deep learning.
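
A small sketch of both tips, assuming the arrays start out channels-first and in float64 (the shapes and variable names here are made up for illustration). TensorFlow's default image layout is channels-last, i.e. `(batch, height, width, channels)`.

```python
import numpy as np

# A batch in channels-first layout (N, C, H, W), stored as float64.
batch_nchw = np.zeros((8, 3, 224, 224), dtype=np.float64)

# Move the channel axis to the end to get (N, H, W, C).
batch_nhwc = np.transpose(batch_nchw, (0, 2, 3, 1))

# Cast to float32 and make the result contiguous in memory:
# same shape, half the bytes of the float64 original.
batch32 = np.ascontiguousarray(batch_nhwc, dtype=np.float32)

bytes_before = batch_nchw.nbytes
bytes_after = batch32.nbytes
```

Doing the cast once, before the arrays are written to disk, means every later stage reads and computes on the smaller float32 data.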
