Closed hlapin closed 1 year ago
On 23/11/05 02:56PM, Hayim Lapin wrote:
[In case it matters, I am running kraken in a colab environment.]
I am trying to train a recognition model on top of an existing model. GT is in paired PageXML and image files.
After initiating ketos train with the following parameters: -f xml (same result with -f page), I get the error:
FileNotFoundError: [Errno 2] No such file or directory: '../kraken/recognition_models/sinai_no_voc_61.gt.txt' # where sinai_no_voc_61 is the model name to retrieve
But if I understand what is supposed to happen, kraken is not supposed to look for training data in *.gt.txt but in the documents specified with the -e and -t flags, which in this case are in entirely different directories. Am I missing something very basic here?
Can you show me the whole command you're running? I guess you didn't put the -f argument in the right place, otherwise it wouldn't search for a *.gt.txt file.
!ketos train \
-o trained_recognition_models/{project_name}_{trial} \
-t recognition_training_xml/{project_name}_xml_train.txt \
-e recognition_training_xml/{project_name}_xml_test.txt \
-q early \
-- verbose \
--normalize-whitespace \
--reorder \
-f xml \
-d cuda:0 \
--resize add \
-i {path_to_model}/{recog_model} \
-r 0.0001 \
-B 1 # batchsize
Moving up -f generates:
Error: No training data was provided to the train command. Use `-t` or the `ground_truth` argument.
Sorry, stupid mistake.
Moving up -f results in:
[11/06/23 15:00:18] WARNING Parsing recognition_training_xml/maim_autogr_xml_train.txt
And then it quits.
On 23/11/06 07:01AM, Hayim Lapin wrote:
!ketos train \
-o trained_recognition_models/{project_name}_{trial} \
-t recognition_training_xml/{project_name}_xml_train.txt \
-e recognition_training_xml/{project_name}_xml_test.txt \
-q early \
-- verbose \
--normalize-whitespace \
--reorder \
-f xml \
-d cuda:0 \
--resize add \
-i {path_to_model}/{recog_model} \
-r 0.0001 \
-B 1 # batchsize
The issue is the space in "-- verbose". A bare "--" on its own is a shell convention that stops option parsing, so everything after it is no longer read as options and is used as input files instead.
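To make the effect of the stray "--" concrete, here is a small sketch using Python's standard argparse module. It is only an analogy: kraken's own command line is built differently, and the option names and the "path" default below are assumptions chosen to resemble the command in this thread, not kraken's actual definitions. The point is just that once "--" appears, later tokens such as -f xml stop being options and end up in the list of input files.

```python
import argparse


def parse(argv):
    """Parse argv with a toy parser that loosely mimics a training command.

    The option names and defaults here are illustrative assumptions,
    not kraken's real interface.
    """
    parser = argparse.ArgumentParser(prog="toy-train")
    parser.add_argument("-f", "--format-type", default="path")
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("ground_truth", nargs="*")
    return parser.parse_args(argv)


# "-- verbose" with a space: "--" ends option parsing, so "verbose",
# "-f" and "xml" are all collected as positional input files.
broken = parse(["--", "verbose", "-f", "xml", "train.txt"])

# "--verbose" without the space: options are parsed as intended.
fixed = parse(["--verbose", "-f", "xml", "train.txt"])

print("broken:", broken.format_type, broken.ground_truth)
print("fixed: ", fixed.format_type, fixed.ground_truth)
```

In the broken call the format stays at its default and the option tokens are treated as files, which is consistent with the tool going looking for a *.gt.txt file next to something that was never meant to be training data.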
Error: No such option: -v
or
Error: No such option: --verbose
without -v, --verbose:
[11/06/23 15:53:53] WARNING Could not open file
On 23/11/06 08:43AM, Hayim Lapin wrote:
Error: No such option: -v
or
Error: No such option: --verbose
Yes, sorry, the verbose option is on the base command ketos, as it exists on all subcommands. So:
ketos -v train ....
would be correct.
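For anyone assembling the command in a notebook, one way to avoid this kind of ordering and spacing mistake is to build the argument list in Python and print it before running it. The sketch below is only that: the function name and the example paths are hypothetical, and the train options are copied from the command posted in this thread rather than checked against any particular kraken version. It never executes ketos itself.

```python
import shlex


def build_ketos_train_command(train_manifest, eval_manifest, output_prefix,
                              base_model, verbosity=1):
    """Return the argv list for a ketos train call (hypothetical helper).

    -v goes on the base command, before the subcommand; repeating it
    raises the verbosity. All other options mirror the thread's command.
    """
    cmd = ["ketos"]
    cmd += ["-v"] * verbosity          # global option, before "train"
    cmd += [
        "train",
        "-f", "xml",
        "-o", output_prefix,
        "-t", train_manifest,
        "-e", eval_manifest,
        "-q", "early",
        "--normalize-whitespace",
        "--reorder",
        "-d", "cuda:0",
        "--resize", "add",
        "-i", base_model,
        "-r", "0.0001",
        "-B", "1",
    ]
    return cmd


# Example with made-up paths; print the shell-quoted command for inspection.
cmd = build_ketos_train_command(
    "recognition_training_xml/train.txt",
    "recognition_training_xml/test.txt",
    "trained_recognition_models/example",
    "models/base.mlmodel",
)
print(shlex.join(cmd))
```

Because every token is a separate list element, a stray space cannot split an option such as --verbose into "--" and "verbose".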
without -v, --verbose:
[11/06/23 15:53:53] WARNING Could not open file
That is just a warning. Probably an empty line in your manifest file. It should just skip anything that isn't loadable.
You can close this issue (file paths and *.gt.txt), but kraken still does not run.
[11/06/23 19:51:08] INFO Loading existing model from
and then quits after 21 seconds or so
On 23/11/06 11:59AM, Hayim Lapin wrote:
You can close this issue (file paths and *.gt.txt), but kraken still does not load.
[11/06/23 19:51:08] INFO Loading existing model from
and then quits after 21 seconds or so
Hm, weird. You can add multiple -v switches to increase the verbosity.
My most immediate suspicion is that the manifest files still don't point to the right files. The paths in there need to be either absolute or relative to the current location and not the location of the file itself.
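A quick way to test that suspicion before starting a training run is to check every manifest entry against the current working directory. The following is a minimal sketch, assuming the manifest is a plain text file with one path per line; the function name check_manifest is made up for illustration and is not part of kraken.

```python
from pathlib import Path


def check_manifest(manifest, base_dir=None):
    """Report manifest entries that do not resolve to an existing file.

    Relative entries are resolved against base_dir (default: the current
    working directory), not against the manifest's own location.
    Returns (missing_entries, number_of_blank_lines).
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    missing = []
    blank = 0
    for line in Path(manifest).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if not entry:
            blank += 1              # empty lines cannot be loaded as files
            continue
        path = Path(entry)
        if not path.is_absolute():
            path = base / path      # relative to the working directory
        if not path.is_file():
            missing.append(entry)
    return missing, blank
```

Calling check_manifest on the training and evaluation manifests from the directory where ketos is launched should return an empty list of missing entries; a non-zero blank count would also account for a "Could not open file" warning like the one above.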
In case anyone finds this issue, reporting that the problem was all on my side, setting up the training data directory.