Kayliii closed this 3 months ago
It may be a text file encoding issue.
If you can modify the Python code, test this fix: https://github.com/KichangKim/DeepDanbooru/blob/05eb3c39b0fae43e3caf39df801615fe79b27c2f/deepdanbooru/data/dataset.py#L6
def load_tags(tags_path):
    with open(tags_path, "r") as tags_stream:
        tags = [tag for tag in (tag.strip() for tag in tags_stream) if tag]
        return tags
to
def load_tags(tags_path):
    with open(tags_path, "r", encoding="utf-8") as tags_stream:
        tags = [tag for tag in (tag.strip() for tag in tags_stream) if tag]
        return tags
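A quick way to check the change, as a minimal sketch: write a throwaway tags file containing ō as UTF-8 and read it back with the patched function. The temporary file and the sample tag names are made up for illustration and are not taken from the real tags.txt.

```python
import os
import tempfile


def load_tags(tags_path):
    # Patched version: decode explicitly as UTF-8 instead of relying on
    # the platform's default encoding.
    with open(tags_path, "r", encoding="utf-8") as tags_stream:
        tags = [tag for tag in (tag.strip() for tag in tags_stream) if tag]
        return tags


# Hypothetical sample data, chosen only because it contains "ō".
sample_tags = ["1girl", "tōhō", "solo"]

with tempfile.TemporaryDirectory() as tmp_dir:
    tags_path = os.path.join(tmp_dir, "tags.txt")
    # Write the file as UTF-8; the trailing blank line should be skipped on load.
    with open(tags_path, "w", encoding="utf-8") as f:
        f.write("\n".join(sample_tags) + "\n\n")
    loaded_tags = load_tags(tags_path)

print(loaded_tags)
```

If the tags come back unchanged, the file is being read as UTF-8 regardless of the system default.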
The dataset I am using to build the tag database and tags.txt has some letters that deepdanbooru crashes on. Specifically, in my case, it does not like the letter ō, which produces the following error (abbreviated to show the relevant part):

ō is a single character encoded as c5 8d. If it gets to 8d without understanding that it's part of the previous character, something has already gone wrong.
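The byte-level claim can be checked with a short sketch. It assumes the file was being decoded with cp1252, a common Windows default; the actual default on the affected machine is not stated here, so treat that codec as a stand-in.

```python
char = "ō"

# UTF-8 encodes this one character as two bytes.
utf8_bytes = char.encode("utf-8")
utf8_hex = " ".join(f"{b:02x}" for b in utf8_bytes)
print(utf8_hex)  # c5 8d

# Assumption: cp1252 stands in for a non-UTF-8 platform default.
# It reads one byte per character, so 0x8d is looked up on its own,
# and that byte has no mapping in Python's cp1252 codec.
decode_error = None
try:
    utf8_bytes.decode("cp1252")
except UnicodeDecodeError as exc:
    decode_error = exc
    print("cp1252 failed at byte offset", exc.start)

# Decoding the same bytes as UTF-8 gives the original character back.
decoded = utf8_bytes.decode("utf-8")
print(decoded == char)
```

This is why passing encoding="utf-8" to open() helps: the two bytes are then consumed together as one character instead of one at a time.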