i was facing same issue but solve.....

Brutal15229 commented 5 days ago

I was facing the same issue, but solved it with:

label_file_list = ratio_list

Originally posted by @jitesh-rathod in https://github.com/PaddlePaddle/PaddleOCR/issues/1996#issuecomment-811715798

Brutal15229 commented 5 days ago

When I do this, I get the following error:

  File "/data/data/aditya_llama/Ocr_training/archive/PaddleOCR/ppocr/data/__init__.py", line 107, in build_dataloader
    dataset = eval(module_name)(config, mode, logger, seed)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/aditya_llama/Ocr_training/archive/PaddleOCR/ppocr/data/simple_dataset.py", line 56, in __init__
    self.data_lines = self.get_image_info_list(label_file_list, ratio_list)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/aditya_llama/Ocr_training/archive/PaddleOCR/ppocr/data/simple_dataset.py", line 73, in get_image_info_list
    lines = random.sample(lines, round(len(lines) * ratio_list[idx]))
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: type str doesn't define __round__ method

Brutal15229 commented 5 days ago

Please help me, I'm stuck here.

GreatV commented 4 days ago

It seems you are referencing a PaddleOCR-related issue where the user claimed to have resolved their problem with the following code:

label_file_list = ratio_list

The context of your question seems to suggest that you are encountering similar issues with PaddleOCR, particularly errors like KeyError: 'label' or other data-related discrepancies during processing.

Potential Causes and Solutions:

Error: KeyError: 'label'
- Cause: This error typically occurs when the data loader or transformation pipeline is unable to find the label key in the input data or annotation file. This might happen due to:
  - Improperly formatted dataset annotations.
  - Missing label fields in the dataset JSON or CSV files.
  - Incorrect transforms or keys being used in the configuration.
- Solution:
  - Check your dataset annotation files to ensure that each entry contains a valid label key. For example:
```
{
"transcription": "example text",
"points": [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
"label": "class_name"
}
```
    If the label field is missing, you need to add it manually or modify your dataset generation script to include it.
  - Ensure that your KeepKeys transform in the configuration includes label if it's being used in the pipeline:
```
KeepKeys:
  keep_keys: ['image', 'label', 'shape', ...]
```
  - Update the VQATokenLabelEncode or other encoding transforms to correctly map the label field if required.
Misalignment of label_file_list and ratio_list
- Cause: During training, label_file_list and ratio_list are used together to load and balance datasets. If these lists are not aligned in length or content, errors can occur.
- Solution:
  - Ensure that the label_file_list in your configuration corresponds to the correct file paths of your dataset annotations.
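  - Verify that ratio_list contains the same number of entries as label_file_list and that every entry is a number rather than a string (e.g., [1.0] to use all of a single label file, or [0.7, 0.3] to sample 70% of the first file and 30% of the second). A string entry makes len(lines) * ratio_list[idx] a string, which produces the "type str doesn't define __round__ method" error shown in the traceback above.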
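
A quick way to test the ratio_list points above before starting a training run is a small standalone check. This is only a sketch: check_ratio_list is a hypothetical helper, not part of PaddleOCR, and it assumes you have already read label_file_list and ratio_list out of your YAML config yourself. The last few lines reproduce the TypeError from the traceback in plain Python.

```python
from numbers import Real


def check_ratio_list(label_file_list, ratio_list):
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems = []
    # Assumption: a single bare number applies to every label file.
    if isinstance(ratio_list, Real) and not isinstance(ratio_list, bool):
        ratio_list = [float(ratio_list)] * len(label_file_list)
    # A plain string (for example a file path) is the likely cause of the TypeError.
    if isinstance(ratio_list, str):
        problems.append(f"ratio_list is a string ({ratio_list!r}), expected numbers")
        return problems
    if len(ratio_list) != len(label_file_list):
        problems.append(
            f"ratio_list has {len(ratio_list)} entries, "
            f"label_file_list has {len(label_file_list)}"
        )
    for idx, ratio in enumerate(ratio_list):
        if isinstance(ratio, bool) or not isinstance(ratio, Real):
            problems.append(f"ratio_list[{idx}] = {ratio!r} is not a number")
        elif not 0 <= ratio <= 1:
            # random.sample cannot draw more items than the population holds.
            problems.append(f"ratio_list[{idx}] = {ratio!r} is outside [0, 1]")
    return problems


# Reproduce the error: int * str is a str, and round() does not accept a str.
try:
    round(3 * "x")
except TypeError as exc:
    reproduced = str(exc)
```

If check_ratio_list(label_file_list, ratio_list) returns an empty list, the two settings are at least consistent with each other; any returned message points at the entry to fix in the config.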
Error in table/predict_table.py
- Cause: If you are working with table recognition, errors can occur due to improperly structured input data or configuration mismatches.
- Solution:
  - Ensure that the table annotation files are in the correct format required by PaddleOCR.
  - If the error occurs during prediction, check the predict_table.py script and ensure that the input image paths and preprocessing steps are configured correctly.
General Debugging Approach:
- Enable debug logs by setting show_log=True in the PaddleOCR configuration or script.
- Run the script with a smaller subset of your dataset to isolate problematic entries.
- Validate your data pipeline step by step (e.g., loading images, checking annotations, running transforms).
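
To isolate problematic entries as suggested in the debugging steps above, something like the following could be run over a label file first. It is a rough sketch under one assumption: each line has the form image path, a tab, then a JSON list of objects with "transcription" and "points" keys (the detection-style layout). Recognition label files that hold plain text after the tab would need a different check. find_bad_lines is a hypothetical name, not a PaddleOCR function.

```python
import json


def find_bad_lines(lines, required_keys=("transcription", "points")):
    """Yield (line_number, reason) for each line that fails the assumed layout."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        _path, sep, annotation = line.partition("\t")
        if not sep:
            yield number, "no tab separator"
            continue
        try:
            entries = json.loads(annotation)
        except json.JSONDecodeError as exc:
            yield number, f"invalid JSON: {exc.msg}"
            continue
        if not isinstance(entries, list):
            yield number, "annotation is not a list"
            continue
        for entry in entries:
            if not isinstance(entry, dict):
                yield number, "annotation entry is not an object"
                break
            missing = [key for key in required_keys if key not in entry]
            if missing:
                yield number, f"missing keys: {missing}"
                break  # report each line at most once
```

Usage would be along the lines of opening the label file with open(path, encoding="utf-8") and printing every pair that find_bad_lines yields for it.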

Relevant Issues and Discussions:
Suggestions from the User's Post: The snippet label_file_list = ratio_list might imply that the user manually assigned the label_file_list variable to resolve an alignment or configuration mismatch. Ensure that both variables are correctly defined in your configuration YAML file.

If you can provide more details about your specific issue (e.g., errors, configurations, or dataset structure), I can provide a more targeted solution!

Response generated by 🤖 feifei-bot | chatgpt-4o-latest

PaddlePaddle / PaddleOCR

i was facing same issue but solve..... #14287