bruesba opened 5 years ago
Hi @bruesba ,
Pretty sure this is because we haven't yet implemented using .pngs as data sources. That's on the to-do list, but we haven't gotten to it; it looks like it would essentially require transforming the imageio.core.util.Array object into a numpy array or torch tensor. Alternatively (until this is implemented), you should be able to convert your source data to TIFFs and be on your way.
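In the meantime, the conversion is straightforward with Pillow and numpy (a minimal sketch; the `png_to_tiff` helper and paths are illustrative, not part of solaris):

```python
import numpy as np
from PIL import Image


def png_to_tiff(png_path, tiff_path):
    """Load a PNG, coerce it to a plain numpy array, and re-save it as a TIFF."""
    arr = np.asarray(Image.open(png_path))  # plain ndarray, sidesteps imageio.core.util.Array
    Image.fromarray(arr).save(tiff_path, format="TIFF")
    return arr
```

PNG and (uncompressed) TIFF are both lossless for 8-bit data, so the pixel values should round-trip unchanged.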
-Nick
Hey Nick,
You're right, after converting my input data to .tiff, training works as expected. I thought I'd trained on .pngs before, but I must have mixed something up! Inference from .pngs is already possible, isn't it? Thank you for your hard work!
@bruesba,
We haven't explicitly implemented it yet, but if you try it and it works, let me know ;) a lot of the back-end stuff that we use (e.g. scikit-image and opencv) can work with .pngs; it just depends on whether the subtle differences in I/O between formats cause problems, such as the one you raised here.
-N
Inferring from a .png using the pre-trained XDXD network does seem to work (i.e. it does not throw an error), but the output is not binary, as it presumably should be. Could this explain the vague output reported further down the thread in issue #212?
Here are the outputs of the same photo in .tif and in .png (from my own dataset; the .png was in RGB rather than BGR but the point still stands). Is the preds_to_binary operation automatically carried out on .tifs by the Inferer? If so, what bg_threshold is used?
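For reference, flipping RGB to BGR is just a reversal of the channel axis (a numpy sketch, assuming an H x W x C array layout):

```python
import numpy as np


def rgb_to_bgr(arr):
    """Reverse the channel axis of an H x W x C image array (RGB <-> BGR)."""
    return arr[..., ::-1]
```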
Good to know that .pngs work.
The preds_to_binary operation doesn't get run automatically; the goal was to leave raw probability (or probability-like) outputs in case users want to do something interesting with the inference outputs (such as combining probability estimates from different models in an ensembling step). Binarization runs during polygonization of outputs (see sol.vector.mask.mask_to_poly_geojson(), which currently defaults to setting the threshold at 0). Alternatively, you can directly binarize outputs with sol.vector.mask.preds_to_binary().
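Conceptually, that binarization step is just thresholding the raw output array (a minimal numpy sketch of the idea, not solaris's actual implementation; the `binarize` helper and its defaults are illustrative):

```python
import numpy as np


def binarize(pred_arr, bg_threshold=0):
    """Pixels above the threshold become foreground (255); the rest become background (0)."""
    return (np.asarray(pred_arr) > bg_threshold).astype(np.uint8) * 255
```

With a threshold of 0, any strictly positive raw output is treated as foreground.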
I'm going to close this because .png support requests are already covered by #184.
I'm not sure I understand - you're saying binarisation does run automatically in the polygonisation phase of inference, right? Wouldn't a default threshold of 0 then return an all-white image, i.e. one where every pixel is labelled a building with sufficient certainty? Or is binarisation not carried out automatically when Inferer is called, so a greyscale output is expected? I'm confused because the image below is the output I get when I run Inferer on a .tif, and I'm not sure at what level of confidence buildings are highlighted. (Please forgive the print-screen.)
By contrast, a non-binarised image is returned when I attempt to infer from a .png (which fits your comment about raw probability better, if I understand correctly):
Is this behaviour intended? What threshold is used in the former output? To be clear, my code consists only of:
```python
import solaris as sol

pre_trained = r'path/to/config'
config = sol.utils.config.parse(pre_trained)
inferer = sol.nets.infer.Inferer(config)
inference_data = sol.nets.infer.get_infer_df(config)
inferer(inference_data)
```
Best, Blue
Huh. So Inferer.infer() doesn't binarize at all - that's all done later. It looks to me like the predictions you're getting out of the TIF-formatted image are just better than the predictions you're getting out of the PNG (higher confidence, i.e. a bigger difference between background and foreground pixels). You can see a few pixels around some edges in the TIF output that you pasted here that don't look quite as bright - those may in fact be something other than 0 or 1, unless I'm mistaken.
I'm assuming those two outputs were generated from the same trained model? If so, my guess would be that when the PNG gets loaded in, the values in the array produced are different - maybe they're scaled differently - and since the model was trained on TIF-formatted inputs, it doesn't do as well with the PNG inputs. You could check this by loading both the TIF and PNG inputs with sol.utils.io.imread() and checking to see if the values in the arrays produced are in fact different. That's just a guess though...
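That check boils down to comparing the dtype and value range of the two arrays once loaded (a hedged sketch; the `summarize` helper is illustrative, and in solaris you'd obtain the arrays with sol.utils.io.imread()):

```python
import numpy as np


def summarize(arr):
    """Return the dtype and value range of a loaded image array."""
    arr = np.asarray(arr)
    return str(arr.dtype), float(arr.min()), float(arr.max())

# A uint8 PNG in [0, 255] sitting next to a float TIFF in [0.0, 1.0] would
# explain why a model trained on one format does poorly on the other.
```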
Regardless, this is something we need to clarify in the documentation, so thanks for bringing it to our attention.
Any idea how the following error can be avoided? I'm running into it while attempting to train the XDXD model. I haven't seen it before, and I'm using .pngs for both the image and mask batches as usual.
The masks are in 8-bit greyscale (PIL 'L' mode), since 3-dimensional B/W throws: