common-voice / commonvoice-fr

Tooling for producing French dataset for Common Voice

Fix `best_dev_checkpoint` path #167

Open wasertech opened 1 year ago

The produced checkpoints record absolute paths where they should really use relative ones.

Checkpoints FR STT 0.9

# ./best_dev_checkpoint
model_checkpoint_path: "/mnt/checkpoints/best_dev-221133"
all_model_checkpoint_paths: "/mnt/checkpoints/best_dev-221133"

Here is the English checkpoint for reference.

# ./best_dev_checkpoint
model_checkpoint_path: "best_dev-3663881"
all_model_checkpoint_paths: "best_dev-3663881"

The fix should probably go into package.sh, I think.
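A minimal sketch of what such a fix could look like, assuming the index file has the two-line format shown above. The function name `relativize_checkpoint_index` is hypothetical and is not taken from package.sh; it only strips the directory part of each quoted path and leaves already-relative entries untouched.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of package.sh): rewrite the quoted paths in a
# checkpoint index file such as best_dev_checkpoint so that only the
# checkpoint name remains, e.g. "/mnt/checkpoints/best_dev-1" -> "best_dev-1".
relativize_checkpoint_index() {
    local index_file="$1"
    local tmp_file="${index_file}.tmp"

    # Match lines of the form  key: "dir/.../name"  and keep  key: "name".
    # Lines without a slash inside the quotes do not match and pass through.
    sed -E 's#^([a-z_]+: ")[^"]*/([^"/]+")$#\1\2#' "$index_file" > "$tmp_file" \
        && mv "$tmp_file" "$index_file"
}
```

It could be called as `relativize_checkpoint_index /mnt/checkpoints/best_dev_checkpoint` before the checkpoints are archived. That location is an assumption based on the paths recorded above.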

Related: https://github.com/coqui-ai/STT/issues/2338

STT's lm_optimizer throws a NotFoundError from TF because the recorded path points nowhere: in the transfer-learning configuration, the checkpoint path is /transfer-checkpoint/ instead of /mnt/checkpoints/.

https://github.com/common-voice/commonvoice-fr/blob/5699e59244d14bb14d5b7603b91c934b761c9194/DeepSpeech/CONTRIBUTING.md?plain=1#L76-L78
https://github.com/common-voice/commonvoice-fr/blob/5699e59244d14bb14d5b7603b91c934b761c9194/DeepSpeech/train.sh#L14-L17
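To see why the absolute path breaks once the checkpoint directory is mounted somewhere else, here is a rough diagnostic sketch. The function name `checkpoint_resolves` is hypothetical, and the check assumes that a TensorFlow checkpoint prefix comes with a `<prefix>.index` file and that relative prefixes are resolved against the directory holding the index file.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the checkpoint named in an index file
# (e.g. best_dev_checkpoint) can actually be found on disk.
checkpoint_resolves() {
    local index_file="$1"
    local dir prefix
    dir="$(dirname "$index_file")"

    # Extract the value between the quotes on the model_checkpoint_path line.
    prefix="$(sed -n -E 's#^model_checkpoint_path: "([^"]*)"$#\1#p' "$index_file" | head -n 1)"
    [ -n "$prefix" ] || return 1

    # Absolute prefixes are used as-is; relative ones are resolved against
    # the directory that contains the index file.
    case "$prefix" in
        /*) ;;
        *) prefix="$dir/$prefix" ;;
    esac

    # Assumption: the checkpoint is present if "<prefix>.index" exists.
    [ -f "${prefix}.index" ]
}
```

With the FR index file above mounted at /transfer-checkpoint, `checkpoint_resolves /transfer-checkpoint/best_dev_checkpoint` would fail unless /mnt/checkpoints also happens to contain the checkpoint, whereas a relative entry would resolve in either location.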

That was for DeepSpeech (DS), but with STT we have access to the --load_checkpoint_dir and --save_checkpoint_dir flags instead.

    if [ -f "/transfer-checkpoint/checkpoint" ] && [ ! -f "/mnt/models/output_graph.tflite" ]; then
        echo "Using checkpoint from ${TRANSFER_CHECKPOINT}"
        # use --load_checkpoint_dir for transfer learning
        LOAD_CHECKPOINT_FROM="--load_checkpoint_dir /transfer-checkpoint --save_checkpoint_dir /mnt/checkpoints"
    else
        LOAD_CHECKPOINT_FROM="--checkpoint_dir /mnt/checkpoints/"
    fi;

Example from my branch stt140-cv9