Closed danpovey closed 6 years ago
Are these functions for data preparation purpose?
I guess this task is similar to the mask-generation task done for the MADCAT Arabic images. There, to fill all points inside a polygon (a rectangle in that case), the code looped over every point inside each bounding box. Since the MADCAT images are high-resolution (around 400k points in a bounding box), this made the code slow, as Python is not very good with loops. For 42k MADCAT images, even after speedup and parallelization, it can take around 18 hours. That might not be necessary here, since in the MADCAT case we had the overlapping-bounding-box constraint; but if possible, should we do this task in C++?
Ok, I get it. BTW, to provide an alternative, see https://www.kaggle.com/c/data-science-bowl-2018#evaluation. They use another method for encoding the mask instead of the polygon idea used here. Not sure which one is more feasible and efficient.
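For reference, the encoding on the Kaggle page linked above is run-length encoding: the mask is flattened and stored as (start, length) pairs of foreground runs. A minimal sketch for comparison with the polygon representation (the exact Kaggle convention differs in details such as scan order, so treat this as illustrative):

```python
import numpy as np

def rle_encode(mask):
    """Encode a boolean mask as a list of (start, length) pairs,
    1-indexed, scanning the flattened array row by row."""
    # Pad with zeros so every run has a detectable start and end.
    flat = np.concatenate([[0], mask.flatten(), [0]])
    # Positions where the value changes mark run boundaries.
    changes = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    starts, ends = changes[0::2], changes[1::2]
    return [(int(s), int(e - s)) for s, e in zip(starts, ends)]

runs = rle_encode(np.array([[0, 1], [1, 0]], dtype=bool))  # → [(2, 2)]
```

This is compact for blob-like masks, but unlike the polygon representation it fixes a resolution and does not directly support geometric operations such as scaling.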
The data types are initially for data preparation but would likely be reusable for the nnet output.
I assume this conversation right now is about enumerating all pixels inside a polygon.
Regarding efficiency: for the mask generation, at some point it would be necessary to enumerate all pixels, because we need to create the mask array. And I want this to cover non-convex polygons such as bent text, so the kaggle approach isn't quite general enough. Let's just get something working for now and worry more about efficiency later; we can use the simple approach for regression testing.
I suggest making the code find all pixels for now, and we can try more efficient versions later. It might be possible to use some kind of trick to do this fast: for example, draw each line in the polygon in such a way that for each height (i.e. each y value) each line has exactly one x value present, then store those x values as lists indexed by y value, and then use an even/odd approach to fill in the locations between alternating x values. Be careful about corners where 2 lines are present.
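The even/odd trick described above is essentially a scanline fill. A rough sketch, assuming the polygon is given as a list of (x, y) vertex tuples (names are illustrative, not from the codebase):

```python
def fill_polygon(vertices):
    """Return the set of integer (x, y) points inside a (possibly
    non-convex) polygon, using an even/odd scanline rule."""
    ys = [y for _, y in vertices]
    points = set()
    n = len(vertices)
    for y in range(min(ys), max(ys) + 1):
        # Collect x-crossings of each edge with the horizontal line at y.
        crossings = []
        for i in range(n):
            (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
            # Half-open test on y: each corner is counted for exactly one
            # of its two edges, which handles the corner caveat above.
            # Horizontal edges (y1 == y2) are skipped entirely.
            if (y1 <= y < y2) or (y2 <= y < y1):
                t = (y - y1) / (y2 - y1)
                crossings.append(x1 + t * (x2 - x1))
        crossings.sort()
        # Fill between alternating pairs of crossings (even/odd rule).
        for xa, xb in zip(crossings[0::2], crossings[1::2]):
            for x in range(int(round(xa)), int(round(xb)) + 1):
                points.add((x, y))
    return points

square = fill_polygon([(0, 0), (4, 0), (4, 4), (0, 4)])  # interior + edges
```

This does one pass per scanline instead of testing every pixel against every edge, and it works for non-convex shapes such as bent text because the even/odd rule handles multiple crossings per row.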
Actually, let's not rule out dumping the masks to disk somehow. I don't know enough about how I/O is done in PyTorch to know exactly how we should do it, but I'm sure Yiwen will have something to say about that. (And let's try to make it possible to use chars or shorts to do this, if Python lets us, as it would take less disk space.)
As long as we have the functionality to turn the polygon-based representation into a mask-based representation, we can always do that in data preparation instead of during training.
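On the char/short point above: numpy does let us pick a small integer dtype for the mask array before dumping it. A minimal sketch (uint8 allows up to 255 object ids per image; uint16 would allow more, at twice the size; an in-memory buffer stands in for a real file here):

```python
import io
import numpy as np

# Object id 0 = background; uint8 is 1 byte per pixel instead of
# int64's 8 bytes, so the dump is 8x smaller.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[10:20, 30:50] = 1  # paint object 1

buf = io.BytesIO()          # stands in for a file on disk
np.save(buf, mask)
buf.seek(0)
loaded = np.load(buf)       # dtype round-trips through the .npy format
```

The .npy format records the dtype, so the mask comes back as uint8 without any extra bookkeeping on the loading side.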
To clarify: when I talk about 'dumping the masks to disk' I am talking about dumping a numpy array of int that contains the object id, like the 'mask' member in Yiwen's nucleus-detection example. I am assuming that must be standard in object detection. We could use 'char' for compression if the libraries allow it.
Sure. I will look into the PyTorch I/O mechanism later to see if it is possible.
Ok, thank you. Got it.
Regarding PyTorch I/O: it seems to like to dump the images and labels together in a tar file, from looking at your process_data.py in egs/dsb2018/v1/local/. One possibility for MADCAT is to have a data preprocessing stage, similar to your process_data.py but run in parallel, in which we would downsample the data slightly (e.g. by a factor of 2 or 4) and at the same time compute the masks, and dump them in a tar file like PyTorch likes to do. Ashish, assume for the time being that that's what we'll do.
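The preprocessing stage above could be sketched roughly as follows; the member names, downsampling factor, and archive layout are illustrative guesses, not the actual layout process_data.py uses (an in-memory buffer stands in for the archive file):

```python
import io
import tarfile
import numpy as np

def add_example(tar, name, image, mask):
    """Append an (image, mask) pair to an open tarfile as .npy members."""
    for suffix, arr in (('img', image), ('mask', mask)):
        buf = io.BytesIO()
        np.save(buf, arr)
        info = tarfile.TarInfo(name='%s.%s.npy' % (name, suffix))
        info.size = buf.tell()
        buf.seek(0)
        tar.addfile(info, buf)

image = np.zeros((64, 64, 3), dtype=np.uint8)
image = image[::2, ::2]  # crude downsampling by a factor of 2
mask = np.zeros((32, 32), dtype=np.uint8)  # computed from polygons in reality

archive = io.BytesIO()  # stands in for the tar file on disk
with tarfile.open(fileobj=archive, mode='w') as tar:
    add_example(tar, 'example_0', image, mask)
```

Since each example is independent, a parallel run would just have each job write its own shard of the archive.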
Ok, thanks, will add process_data.py and an option for downsampling images. For polygon mask generation, can we have an overlapping-polygon situation?
Yes, there can be overlapping polygons, but we'll deal with it by ordering the polygons from "bottom-most" to "top-most". I was leaving that till later. For now just ignore the issue. We'll be writing the mask in an array of the form
`mask[x,y] = object_id`
and the top-most polygons will get written last and override previous ones.
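The layering rule above amounts to painting object masks in order, letting later (top-most) ids overwrite earlier ones where they overlap. A small sketch (the function and variable names are illustrative):

```python
import numpy as np

def render_mask(shape, objects):
    """Render an object-id mask from a list of (object_id, boolean_mask)
    pairs ordered from bottom-most to top-most."""
    mask = np.zeros(shape, dtype=np.uint8)  # 0 = background
    for object_id, inside in objects:
        mask[inside] = object_id  # later (top-most) writes win
    return mask

# Two overlapping 3x3 objects on a 4x4 grid.
a = np.zeros((4, 4), dtype=bool); a[0:3, 0:3] = True  # bottom object, id 1
b = np.zeros((4, 4), dtype=bool); b[1:4, 1:4] = True  # top object, id 2
m = render_mask((4, 4), [(1, a), (2, b)])
```

In the overlap region `m` holds 2, since object 2 was painted last; pixels covered only by object 1 keep id 1.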
ok, thanks, got it.
merging this so it doesn't block anyone. @aarora8, please finish any TODOs if you get time.
ok, thanks, will do.
This is my proposal for the types to use for data. @YiwenShaoStephen, can you please look over this ASAP? This code is mostly just comments without implementation, but let me know if you think it's workable and if you think there is a better way to do this.