ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How to change labels to Korean? #7460

Closed AlvinPark09 closed 2 years ago

AlvinPark09 commented 2 years ago

Question

Hello there. I'm new to YOLOv5 and am trying to change the labels to Korean. I edited the names in data.yaml to Korean, but the result shows squares instead of Korean characters. I know it is a font problem, but I couldn't find where to change the font. Thanks.

test3

github-actions[bot] commented 2 years ago

👋 Hello @AlvinPark09, thank you for your interest in YOLOv5 🚀! Please visit our ⭐ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.

AlvinPark09 commented 2 years ago

This is the list with Korean names that I'm trying to display:

names = ['쌀밥', 'Multigrain Rice', '콩밥'], which means ['Rice', 'Multigrain Rice', 'Rice with beans']

glenn-jocher commented 2 years ago

@AlvinPark09 does the Korean print correctly in the terminal window when printing your detections?

AlvinPark09 commented 2 years ago

Dear @glenn-jocher, sorry to bother you. I just found a solution for changing labels to Korean in an old issue.

In plots.py, I marked the lines I modified with "# modified" comments, for anyone looking for a solution to the same issue. Just change FONT_PATH to the path of a font file that supports Korean.

class Annotator:
    if RANK in (-1, 0):
        check_pil_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    # modified: default font changed to FONT_PATH
    def __init__(self, im, line_width=None, font_size=None, font=FONT_PATH, pil=False, example='abc'):
        assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.'
        self.pil = pil or not is_ascii(example) or is_chinese(example)
        if self.pil:  # use PIL
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            # modified: FONT_PATH replaces the built-in Unicode font here
            self.font = check_pil_font(font=FONT_PATH if is_chinese(example) else font,
                                       size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))
        else:  # use cv2
            self.im = im
        self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2)  # line width

    def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)):
        # Add one xyxy box to image with label
        if self.pil or not is_ascii(label):
            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle([box[0],
                                     box[1] - h if outside else box[1],
                                     box[0] + w + 1,
                                     box[1] + 1 if outside else box[1] + h + 1], fill=color)
                # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls')  # for PIL>8.0
                # modified: load a font that supports Korean; replace "FONT_PATH" with your font file path
                self.font_MYFONT = ImageFont.truetype("FONT_PATH", 40)
                # modified: draw the label with that font
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font_MYFONT)

The result image is attached for reference.

test3

glenn-jocher commented 2 years ago

@AlvinPark09 thanks for the suggested solution!

I used your class name and trained a model and I am able to reproduce your issue. The name prints correctly in the console but does not plot correctly with PIL.

Screen Shot 2022-04-19 at 4 20 53 PM

So it seems that the Arial.Unicode.ttf font we use for Chinese characters is not working with your Korean characters? This line is supposed to detect special characters with is_chinese(str) and use Arial.Unicode.ttf instead of Arial.ttf: https://github.com/ultralytics/yolov5/blob/d876caab4d8f54d11988c277eb2a237bbe405841/utils/plots.py#L79-L80

glenn-jocher commented 2 years ago

@AlvinPark09 ok I've tracked down the problem. The is_chinese() function is not correctly identifying the characters as non-latin. To reproduce:

from utils.general import is_chinese

is_chinese('쌀밥')
False
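
For context, the False result is what a range check limited to CJK ideographs would give. The sketch below is not copied from utils/general.py. It assumes is_chinese() is a regex test over U+4E00 to U+9FFF, and the helper names is_cjk_ideograph and is_ascii_text are hypothetical. Hangul syllables sit at U+AC00 to U+D7A3, so they fall outside that range while still being non-ASCII.

```python
import re

# Assumption: a range test over CJK Unified Ideographs, similar in spirit to is_chinese().
CJK_IDEOGRAPHS = re.compile(r"[\u4e00-\u9fff]")


def is_cjk_ideograph(text):
    """Return True if text contains at least one CJK Unified Ideograph."""
    return bool(CJK_IDEOGRAPHS.search(str(text)))


def is_ascii_text(text):
    """Return True if every character in text is plain ASCII."""
    return str(text).isascii()


# Compare a Korean label, a Chinese label, and an English label.
for label in ("쌀밥", "中文", "car"):
    print(label, is_cjk_ideograph(label), is_ascii_text(label))
```

A Korean label fails both tests, which is why a check built only on the ideograph range would fall back to the Latin font.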

glenn-jocher commented 2 years ago

@AlvinPark09 good news 😃! Your original issue may now be fixed ✅ in PR #7488. This PR generalizes Annotator to use Unicode fonts when ANY non-latin characters are detected (not just Chinese characters). This also adds a check before training starts to pre-download any required fonts.
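
The behaviour described for the PR can be pictured with the sketch below. It is an illustration under assumptions, not the code from the PR: pick_font is a hypothetical helper, and the two font file names are the ones mentioned earlier in this thread (Arial.ttf and Arial.Unicode.ttf).

```python
LATIN_FONT = "Arial.ttf"            # font named in this thread for Latin labels
UNICODE_FONT = "Arial.Unicode.ttf"  # font named in this thread for non-Latin labels


def pick_font(class_names):
    """Hypothetical helper: choose the Unicode font if any label is non-ASCII."""
    non_ascii = any(not str(name).isascii() for name in class_names)
    return UNICODE_FONT if non_ascii else LATIN_FONT


print(pick_font(["쌀밥", "Multigrain Rice", "콩밥"]))
print(pick_font(["person", "bicycle", "car"]))
```

Keying the choice on "any non-ASCII character" rather than on one script means Korean, Japanese, Arabic and Cyrillic labels all take the same path.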

The resulting training now correctly plots labels for your Korean fonts use-case:

val_batch0_labels

To receive this update:

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!

AlvinPark09 commented 2 years ago

@glenn-jocher thanks for all your efforts.

caramelpop commented 1 year ago

Hello, @glenn-jocher and @AlvinPark09

I was researching to change my Yolov5 label to Japanese and came across this article.

When I was changing plots.py as per this article, I got the error OSError: cannot open resource and was told that there was no .ttf file I had set in FONT_PATH.

How did you set the FONT_PATH? I am running locally on Windows, using Anaconda.

Here is the plots.py I changed

FONT_PATH = Path('C:\Windows\Fonts\yumin.ttf')

class Annotator:
    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font=FONT_PATH, pil=False, example='abc'):
        assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.'
        non_ascii = not is_ascii(example)  # non-latin labels, i.e. asian, arabic, cyrillic
        self.pil = pil or non_ascii
        if self.pil:  # use PIL
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            self.font = check_pil_font(font=FONT_PATH if non_ascii else font,
                                       size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))
        else:  # use cv2
            self.im = im
        self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2)  # line width

    def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)):
        # Add one xyxy box to image with label
        if self.pil or not is_ascii(label):
            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle(
                    (box[0], box[1] - h if outside else box[1], box[0] + w + 1,
                     box[1] + 1 if outside else box[1] + h + 1),
                    fill=color,
                )
                # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls')  # for PIL>8.0
                #self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)
                self.font_MYFONT = ImageFont.truetype('FONT_PATH', 40)
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font_MYFONT)
        else:  # cv2

Here is the error text

Traceback (most recent call last):
  File "C:\Users\admin\anaconda3\envs\yolov5\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "C:\Users\admin\anaconda3\envs\yolov5\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\admin\Desktop\v5_jp\utils\plots.py", line 309, in plot_images
    annotator.box_label(box, label, color=color)
  File "C:\Users\admin\Desktop\v5_jp\utils\plots.py", line 102, in box_label
    self.font_MYFONT = ImageFont.truetype('FONT_PATH', 40)
  File "C:\Users\admin\anaconda3\envs\yolov5\lib\site-packages\PIL\ImageFont.py", line 959, in truetype
    return freetype(font)
  File "C:\Users\admin\anaconda3\envs\yolov5\lib\site-packages\PIL\ImageFont.py", line 956, in freetype
    return FreeTypeFont(font, size, index, encoding, layout_engine)
  File "C:\Users\admin\anaconda3\envs\yolov5\lib\site-packages\PIL\ImageFont.py", line 247, in __init__
    self.font = core.getfont(
OSError: cannot open resource

Also, when I changed the contents of the data.yaml I created to Japanese, the label is not displayed correctly in the Anaconda Prompt and is blank.

names: ['ひと', '自転車', 'car'], which means ['person', 'bicycle', 'car']

Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|█
 all         40        782   0.000353     0.0331   0.000419    7.5e-05
             40         99          0          0          0          0
             40          6          0          0          0          0
 car         40         26          0          0          0          0

Is there any solution to this problem? Please help me. Thank you very much.

glenn-jocher commented 1 year ago

@caramelpop Korean and other non-Latin scripts such as Arabic and Cyrillic are all supported in image annotations using PIL, but not all consoles can print these correctly. Your labels should therefore work correctly in YOLOv5 with no changes, but, as you've seen, they may not appear everywhere in console printouts and matplotlib printouts.

caramelpop commented 1 year ago

@glenn-jocher Thanks for replying.

,but not all consoles can print these correctly,

Understood.

So what should I do about OSError?

glenn-jocher commented 11 months ago

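@caramelpop You should be able to use the font by setting FONT_PATH to the font file's absolute path, for example FONT_PATH = 'C:\\Windows\\Fonts\\yumin.ttc', written with double backslashes ('\\') or with forward slashes ('/') so that it references the font file directly. Also make sure the path itself is what gets passed to the font-loading call: in your traceback, ImageFont.truetype('FONT_PATH', 40) receives the quoted string 'FONT_PATH' rather than the FONT_PATH variable.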
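
A short sketch of both points, using only the values shown in this thread. It assumes Pillow is installed, and load_label_font is a hypothetical helper, not part of YOLOv5. The three spellings of the Windows path are equivalent, whereas the quoted string 'FONT_PATH' from the traceback asks Pillow to open a file literally named FONT_PATH.

```python
from pathlib import Path, PureWindowsPath

# Three equivalent ways to spell the same Windows path in Python source.
raw = PureWindowsPath(r"C:\Windows\Fonts\yumin.ttc")        # raw string
escaped = PureWindowsPath("C:\\Windows\\Fonts\\yumin.ttc")  # double backslashes
forward = PureWindowsPath("C:/Windows/Fonts/yumin.ttc")     # forward slashes
print(raw == escaped == forward)


def load_label_font(font_path, size=40):
    """Hypothetical helper: load a TrueType font from a path variable.

    Pass the variable holding the path, not the quoted string 'FONT_PATH'.
    """
    font_path = Path(font_path)
    if not font_path.is_file():
        # Fail early with a clearer message than "cannot open resource".
        raise FileNotFoundError(f"Font file not found: {font_path}")
    from PIL import ImageFont  # imported lazily so the path check needs no Pillow

    return ImageFont.truetype(str(font_path), size)
```

Calling load_label_font(FONT_PATH) with the variable would load the font, while load_label_font('FONT_PATH') fails at the existence check, which mirrors the OSError in the traceback.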

As for the data.yaml file, you'll also have to configure your console to display Japanese characters correctly. This can be a bit system-specific, so you may need to look up how to enable Japanese language settings in your specific OS.
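
On the Python side, one thing that can be checked is the encoding the interpreter uses for console output. The sketch below is a general Python technique, not something YOLOv5 is known to do for you, and ensure_utf8_stdout is a hypothetical helper. Whether the glyphs then render still depends on the console font and the OS language settings.

```python
import sys


def ensure_utf8_stdout():
    """Hypothetical helper: switch stdout to UTF-8 where supported, return the encoding."""
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if callable(reconfigure):
        # Available on text streams in Python 3.7+; unencodable characters are replaced.
        reconfigure(encoding="utf-8", errors="replace")
    return getattr(sys.stdout, "encoding", None) or "unknown"


print("stdout encoding:", ensure_utf8_stdout())
```

Calling this once before class names are printed removes the encoding as a possible cause, leaving only the console's font support to check.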