This is intended. You're using the default model architecture with the input [1,1800,0,3... which uses 3 input channels (RGB), so forced binarization doesn't really make sense. The switch is intended to be used with single-channel inputs, e.g. [1,1800,0,1... where the training data might be a mixture of grayscale and B/W images.
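For illustration, the channel count is the last number of the spec's input block, so a single-channel variant simply swaps the trailing 3 for a 1 (the layer definitions below are only a sketch):
[1,1800,0,3 Cr7,7,64,2,2 ...]  -> 3-channel RGB input, --force-binarization is rejected
[1,1800,0,1 Cr7,7,64,2,2 ...]  -> single-channel input, --force-binarization applies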
Ok thanks for the info!
The switch is intended to be used with single-channel inputs, e.g. [1,1800,0,1... where the training data might be a mixture of grayscale and B/W images
Do you think it could be worth converting the scans to B/W and using the binarization parameter? Could it substantially improve the training? The images I use are not colored scans, so I can convert them to B/W without quality loss.
Actually, I have now converted my images to grayscale, but when trying this command I still get the same error:
ketos segtrain -d cuda:0 -f xml -t output_bw.txt --resize both --schedule cosine -i /path/to/model -o output/model -q early --min-epochs 50 -N 70 --suppress-regions --line-width 10 --workers 48 --augment --force-binarization -s "[1,1800,0,1 Cr7,7,64,2,2 Gn32 Cr3,3,128,2,2 Gn32 Cr3,3,128 Gn32 Cr3,3,256 Gn32 Cr3,3,256 Gn32 Lbx32 Lby32 Cr1,1,32 Gn32 Lby32 Lbx32]"
On 24/08/13 07:26AM, fattynoparents wrote:
Actually, I have now converted my images to grayscale, but when trying this command I still get the same error:
ketos segtrain -d cuda:0 -f xml -t output_bw.txt --resize both --schedule cosine -i /path/to/model -o output/model -q early --min-epochs 50 -N 70 --suppress-regions --line-width 10 --workers 48 --augment --force-binarization -s "[1,1800,0,1 Cr7,7,64,2,2 Gn32 Cr3,3,128,2,2 Gn32 Cr3,3,128 Gn32 Cr3,3,256 Gn32 Cr3,3,256 Gn32 Lbx32 Lby32 Cr1,1,32 Gn32 Lby32 Lbx32]"
This is weird because a) the exception is triggered only in the case where the input spec is multi-channel and b) it works for me with this simple example:
ketos segtrain -f xml -s "[1,600,0,1 Cr7,7,64,2,2]" --force-binarization *.xml
No need to convert the images manually to grayscale by the way, it is sufficient to adjust the input spec.
Thanks for the tips!
I have now tried simplifying my command to see which part causes this, and it turns out to be the -i parameter.
As soon as I remove the path to the existing model that I use as a base for training, the command starts running. Otherwise I get the "Invalid input spec 1, 1800, 0, 3, (0, 0) in combination with forced binarization." exception.
Will there be a fix for this? I'm just trying all possible ways to improve my training score, so I thought that forcing binarization might also help.
Ah yes, sorry. You can't change the input spec of an existing model as it changes the layer shapes, and there are very limited circumstances where you can get away with that. So if you want to fine-tune with B/W data you need to start off from a model that has 1-channel inputs (or you can just binarize your input data manually with kraken binarize, which should be roughly equivalent even without touching the input spec).
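As a rough sketch of the manual route (file names are just placeholders), each page image can be binarized with the kraken CLI before training, e.g.:
kraken -i page_0001.png page_0001.bw.png binarize
The segmentation XML files would then need to reference the binarized images instead of the originals.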
In general, though, binarization should be considered harmful: it boosts accuracy slightly for basic, clean scans but degrades catastrophically in most other cases.
Oh I see, thanks a lot for the explanation.
Do you possibly know other ways to improve the fine-tuning of a segmentation model?
So far I have tried various line widths, two types of schedule (cosine and reduceonplateau) and manually splitting the train and validation images.
I still don't get more than 0.47 val_mean_iu (I have noticed that a higher val_mean_iu corresponds well to the accuracy of my model in eScriptorium on unseen data).
On 24/08/19 10:49AM, fattynoparents wrote:
Do you possibly know other ways to improve the fine-tuning of a segmentation model?
It can be a bit finicky to get the best results, and hyperparameter choice doesn't seem to affect it much. It is possible that your ontology is too complex, so you could try merging some line classes, for example, to see if that improves results.
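If I remember correctly, segtrain exposes this through a merge option along the lines of --merge-baselines with a target:source mapping; the exact flag name and syntax are best checked with ketos segtrain --help, but roughly (class names here are purely hypothetical):
ketos segtrain -f xml --merge-baselines default:heading -t output_bw.txt ...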
I still don't get more than 0.47 val_mean_iu (I have noticed that a higher val_mean_iu corresponds well to the accuracy of my model in eScriptorium on unseen data).
In general the metrics aren't particularly meaningful. Better values do not always correspond to better segmentation results: the line pixel maps go through post-processing, so the link between pixel map IoU and area-less polybaselines is rather tenuous. Obviously, for region-only segmentation models this doesn't apply.
It can be a bit finicky to get the best results, and hyperparameter choice doesn't seem to affect it much. It is possible that your ontology is too complex, so you could try merging some line classes, for example, to see if that improves results.
I don't have line classes at all, so there's nothing to merge unfortunately.
My main problem is that the model often doesn't cut the baselines at the margins, even when there's a clear border at which it should. So, for example, in this case the model would most probably draw a continuous line like in the pic below, while I need a break at the margin:
Ah, in that case you might try to either increase the resolution of the network input (bumping up the 1800 in the definition to something higher) or create different line classes for the marginal text. Those are in different output maps so they will never be merged.
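As a sketch (2400 is just an arbitrary larger value, not a tuned recommendation), bumping the input resolution only means changing the height in the spec, e.g.:
-s "[1,2400,0,1 Cr7,7,64,2,2 Gn32 Cr3,3,128,2,2 Gn32 Cr3,3,128 Gn32 Cr3,3,256 Gn32 Cr3,3,256 Gn32 Lbx32 Lby32 Cr1,1,32 Gn32 Lby32 Lbx32]"
Keep in mind that training gets slower and needs more GPU memory as the input resolution grows.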
Thank you for the suggestion to set different classes for the marginal text; it seems to have improved my model considerably.
When trying to use --force-binarization in the segtrain command I get the following exception. The command is: