added data_io and make changes for dsb2018 setup accordingly

YiwenShaoStephen commented 6 years ago

Added data_io.py to save processed image_with_mask objects as numpy arrays. Also added a "cache=True" option that decides whether to read all the data into memory at once or to load items one by one. The other scripts are changed accordingly for the dsb2018 setup. @aarora8 It would be great if you could try this on madcat.
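To make the "cache=True" trade-off concrete, here is a minimal sketch. It is not the code in this PR: the class name `ArrayDataset` and the `load_fn` parameter are hypothetical, and any callable that turns an id into a loaded item would do.

```python
class ArrayDataset:
    """Hypothetical dataset illustrating a cache flag; not the PR's class."""

    def __init__(self, ids, load_fn, cache=True):
        self.ids = list(ids)
        self.load_fn = load_fn  # assumed: maps an id to a loaded item
        self.cache = cache
        # With cache=True, pay the full loading cost once, up front.
        self.data = [load_fn(i) for i in self.ids] if cache else None

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        if self.cache:
            return self.data[index]
        # With cache=False, load on every access to keep memory use low.
        return self.load_fn(self.ids[index])
```

Caching trades memory for speed: every item is held in memory, but repeated epochs never touch the disk again. The lazy path is the safer choice when the dataset does not fit in memory.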

aarora8 commented 6 years ago

OK, thanks. I will test it with the madcat setup.

On Tue, May 22, 2018, 23:34 Yiwen Shao notifications@github.com wrote:

added data_io.py to save processed image_with_mask as numpy arrays. Also enable a "cache=True" option to decide whether read all data in memory once or do it one by one. The changes on other scripts are made accordingly for dsb2018 setup. @aarora8 https://github.com/aarora8 It would be great if you can try this on madcat.

You can view, comment on, or merge this pull request online at:

https://github.com/waldo-seg/waldo/pull/42

Commit Summary

added data_io and make changes for dsb2018 setup accordingly

File Changes

M egs/dsb2018/v1/local/dataset.py https://github.com/waldo-seg/waldo/pull/42/files#diff-0 (42)

M egs/dsb2018/v1/local/process_data.py https://github.com/waldo-seg/waldo/pull/42/files#diff-1 (45)

M egs/dsb2018/v1/local/segment.py https://github.com/waldo-seg/waldo/pull/42/files#diff-2 (4)

M egs/dsb2018/v1/local/train.py https://github.com/waldo-seg/waldo/pull/42/files#diff-3 (8)

A scripts/waldo/data_io.py https://github.com/waldo-seg/waldo/pull/42/files#diff-4 (36)

Patch Links:

https://github.com/waldo-seg/waldo/pull/42.patch

https://github.com/waldo-seg/waldo/pull/42.diff


aarora8 commented 6 years ago

It works successfully with the MADCAT Arabic setup.

danpovey commented 6 years ago

Great!

On Wed, May 23, 2018 at 2:24 PM, Yiwen Shao notifications@github.com wrote:

@YiwenShaoStephen commented on this pull request.

In scripts/waldo/data_io.py https://github.com/waldo-seg/waldo/pull/42#discussion_r190352424:

    import numpy as np


    class DataSaver:
        def __init__(self, dir):
            self.dir = dir
            if not os.path.exists(self.dir):
                os.makedirs(self.dir)
                os.makedirs(self.dir + '/numpy_arrays')

        def write_image(self, name, image_with_mask):
            """ This function accepts an image_with_mask object and its name, and saves
            its img, mask and object_class as numpy arrays under the given directory
            (i.e. dir/numpy_arrays/name.suffix.npy)
            """
            img = image_with_mask['img']

OK, I will handle all of it.
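For readers following the quoted hunk, the behaviour its docstring describes could look roughly like the sketch below. This is an assumption-laden illustration rather than the merged code: the standalone functions `write_image` and `read_image`, and the use of the dictionary keys as file suffixes, are hypothetical. Only `np.save`, `np.load` and `os.makedirs` are real library calls.

```python
import os

import numpy as np

# Assumed suffixes: the docstring only says "name.suffix.npy".
FIELDS = ('img', 'mask', 'object_class')


def write_image(out_dir, name, image_with_mask):
    """Save each field as out_dir/numpy_arrays/<name>.<field>.npy."""
    array_dir = os.path.join(out_dir, 'numpy_arrays')
    # exist_ok=True avoids an error when the directory is already there.
    os.makedirs(array_dir, exist_ok=True)
    paths = {}
    for field in FIELDS:
        path = os.path.join(array_dir, '{}.{}.npy'.format(name, field))
        np.save(path, np.asarray(image_with_mask[field]))
        paths[field] = path
    return paths


def read_image(out_dir, name):
    """Load the fields written by write_image back into a dict."""
    array_dir = os.path.join(out_dir, 'numpy_arrays')
    return {
        field: np.load(os.path.join(array_dir, '{}.{}.npy'.format(name, field)))
        for field in FIELDS
    }
```

Writing one file per field keeps each array independently loadable, at the cost of three files per image.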


YiwenShaoStephen commented 6 years ago

The shared dataset class WaldoDataset is now added, so we don't need dataset.py anymore. image_ids.txt is now written after all the data has been read. I also changed run.sh to fit @hhadian's newest update. @aarora8 please check whether this pipeline works well with the madcat dataset.
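As a rough illustration of writing image_ids.txt only after every item has been read, consider the sketch below. The helper names `process_all` and `read_image_ids`, the `save_fn` callback, and the one-id-per-line file layout are all assumptions made for this example, not details taken from the PR.

```python
import os


def process_all(items, out_dir, save_fn, filename='image_ids.txt'):
    """Save every (image_id, data) pair, then write the id list once.

    save_fn(image_id, data) is a hypothetical callback that stores one item.
    """
    os.makedirs(out_dir, exist_ok=True)
    image_ids = []
    for image_id, data in items:
        save_fn(image_id, data)
        image_ids.append(image_id)
    # The id file is written last, so it only lists items that were saved.
    path = os.path.join(out_dir, filename)
    with open(path, 'w') as f:
        for image_id in image_ids:
            f.write(image_id + '\n')
    return path


def read_image_ids(out_dir, filename='image_ids.txt'):
    """Return the ids listed in the file, skipping blank lines."""
    with open(os.path.join(out_dir, filename)) as f:
        return [line.strip() for line in f if line.strip()]
```

Deferring the write means a run that fails halfway leaves no id file behind, so a dataset class that reads the file never sees ids whose arrays are missing.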

danpovey commented 6 years ago

great progress! merging.
