PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

where is input_path #11866

Closed. javachens closed this issue 6 months ago

javachens commented 7 months ago

Where does the input path in the following command come from? I could not find this path in the code, and it is not described in the documentation.

Convert the label files downloaded from the official website into train_icdar2015_label.txt

python gen_label.py --mode="det" --root_path="/path/to/icdar_c4_train_imgs/" \
    --input_path="/path/to/ch4_training_localization_transcription_gt" \
    --output_label="/path/to/train_icdar2015_label.txt"

UserWangZz commented 6 months ago

The data conversion tool is located at ppocr/utils/gen_label.py. You can run the command from the ppocr/utils directory. For the arguments, refer to the parser definitions below:

parser.add_argument(
    "--mode",
    type=str,
    default="rec",
    help="Generate rec_label or det_label, can be set rec or det",
)
parser.add_argument(
    "--root_path",
    type=str,
    default=".",
    help="The root directory of images.Only takes effect when mode=det ",
)
parser.add_argument(
    "--input_path",
    type=str,
    default=".",
    help="Input_label or input path to be converted",
)
parser.add_argument(
    "--output_label", type=str, default="out_label.txt", help="Output file name"
)
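
To make the role of --input_path concrete, here is a minimal sketch of what a "det" conversion could look like. It is not the code from gen_label.py. The function name convert_det_labels is hypothetical, and the sketch assumes that the input directory holds one ground-truth file per image, named like gt_img_1.txt, with lines of the form x1,y1,x2,y2,x3,y3,x4,y4,transcription, and that the matching image is img_1.jpg under the root path.

```python
import json
import os


def convert_det_labels(root_path, input_path, output_label):
    """Merge per-image ground-truth files into a single label file.

    Hypothetical helper, not the gen_label.py implementation.
    Assumed input: files named like gt_img_1.txt, each line holding
    x1,y1,x2,y2,x3,y3,x4,y4,transcription.
    """
    with open(output_label, "w", encoding="utf-8") as out_file:
        for name in sorted(os.listdir(input_path)):
            if not name.endswith(".txt"):
                continue
            # gt_img_1.txt -> img_1.jpg (naming convention is an assumption)
            stem = os.path.splitext(name)[0]
            if stem.startswith("gt_"):
                stem = stem[3:]
            img_path = os.path.join(root_path, stem + ".jpg")

            boxes = []
            # utf-8-sig drops a byte-order mark if the file starts with one
            with open(os.path.join(input_path, name), encoding="utf-8-sig") as f:
                for line in f:
                    line = line.strip()
                    if not line:
                        continue
                    # Split at most 8 times so commas inside the text survive
                    parts = line.split(",", 8)
                    coords = [int(v) for v in parts[:8]]
                    points = [coords[i:i + 2] for i in range(0, 8, 2)]
                    text = parts[8] if len(parts) > 8 else ""
                    boxes.append({"transcription": text, "points": points})

            # One output line per image: image path, a tab, then the JSON boxes
            out_file.write(
                img_path + "\t" + json.dumps(boxes, ensure_ascii=False) + "\n"
            )
```

Under these assumptions, --input_path is not a path inside the repository. It is the local directory where you unpacked the label files downloaded from the dataset website, and --root_path is the directory holding the corresponding images.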