APJansen closed this issue 7 months ago
Hi @APJansen, just to make sure: what do you mean by "the outside region"? Would a "cube" like the red one in the figure be acceptable? Notice the upper right corner; although it contains data, there is no soil there.
No, that's what I wanted to avoid.
Although I suppose it is something to consider as well. The downside of including such a box is that we would need an additional class, and the model would have to learn something rather useless. The upside is (minor) that the sampling of cubes would be simpler, and (perhaps major?) that it would be easier to also cover the edges of the scan.
Anyway, I think it would be good to at least have the option to avoid such regions.
That was my guess. Anyway, filtering out those regions poses two problems that may be worse than adding an additional data label. Namely:
Do you have any idea on how to efficiently tackle those problems?
For the first point, I would say let's be wasteful to start with: chop off the first and last 200 horizontal slices, so that in the remaining slices the scan area is more or less constant. Then estimate its radius and make sure all corners of each cube are within that radius from the center of the scan. Later we can be more precise, but for now the goal is just to have any pipeline by which we can feed images into a model so we can do our first experiments. So let's not worry about the second point yet either.
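A minimal sketch of that wasteful first pass, to make the idea concrete. Everything here is an assumption rather than existing project code: the scan is taken to be a NumPy array with axis order (z, y, x) and axis 0 vertical, the helper names `trim_slices`, `estimate_circle` and `cube_inside_circle` are hypothetical, and the radius is estimated from the area of a boolean "sample present" mask of one slice (how that mask is obtained, for example by thresholding, is left open).

```python
import numpy as np


def trim_slices(volume, margin=200):
    """Drop the first and last `margin` horizontal slices (axis 0 is assumed vertical)."""
    depth = volume.shape[0]
    if depth <= 2 * margin:
        raise ValueError("volume has too few slices for this margin")
    return volume[margin:depth - margin]


def estimate_circle(mask):
    """Estimate center and radius of a roughly circular 2D boolean mask.

    The center is the centroid of the mask; the radius is that of a disk
    with the same area as the mask (area = pi * r**2).
    """
    ys, xs = np.nonzero(mask)
    center = (ys.mean(), xs.mean())
    radius = np.sqrt(mask.sum() / np.pi)
    return center, radius


def cube_inside_circle(y0, x0, size, center, radius):
    """Return True if all four horizontal corners of the cube lie within the circle."""
    corners = np.array(
        [(y0, x0), (y0, x0 + size), (y0 + size, x0), (y0 + size, x0 + size)],
        dtype=float,
    )
    distances = np.hypot(corners[:, 0] - center[0], corners[:, 1] - center[1])
    return bool(np.all(distances <= radius))
```

Candidate cube positions could then be filtered with `cube_inside_circle` before any data is cut out. The area-based radius tends to overestimate if the mask has holes filled in incorrectly and underestimate if soil is missing near the wall, so a safety margin on the radius may be worth adding.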
We will need to cut the data up into cubes (or blocks with a different height, but the same dimensions in the horizontal plane). This can be done with just NumPy. Some considerations we'll need to take into account:
This function may be useful for converting the TIFF files to NumPy arrays.
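As a starting point for the cutting step, here is a sketch that splits a volume into non-overlapping blocks using only NumPy reshaping. It assumes the scan is already loaded as a 3D array with axis order (z, y, x); the function name `cut_into_blocks` and its parameters are hypothetical, and any remainder that does not fill a whole block is simply discarded.

```python
import numpy as np


def cut_into_blocks(volume, size, height=None):
    """Cut a (z, y, x) volume into non-overlapping blocks.

    Each block has shape (height, size, size); `height` defaults to `size`,
    which gives cubes. Voxels that do not fill a complete block are dropped.
    Returns an array of shape (n_blocks, height, size, size).
    """
    height = size if height is None else height
    nz = volume.shape[0] // height
    ny = volume.shape[1] // size
    nx = volume.shape[2] // size

    # Crop so that every axis is an exact multiple of the block dimension.
    cropped = volume[: nz * height, : ny * size, : nx * size]

    # Split each axis into (block index, offset within block) ...
    blocks = cropped.reshape(nz, height, ny, size, nx, size)
    # ... then move the three block indices to the front and flatten them.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(-1, height, size, size)


# Small usage example on synthetic data.
demo = np.arange(4 * 4 * 4).reshape(4, 4, 4)
cubes = cut_into_blocks(demo, size=2)
print(cubes.shape)
```

Blocks are ordered by their z index first, then y, then x. This regular grid does not yet skip blocks outside the scanned region; that filtering would have to be applied to the block positions separately.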