AnANTKA closed this issue 1 year ago
I used the cell to download images from the booru sites with bleach as the tag, and it downloaded about 4,000 images, all with the tags.
I just created a script that checks whether each image has a corresponding text file. If it does not, the image is skipped and not added to the JSON file:
import os
import json
from tqdm import tqdm

data = {}
path = "D:\\bleach-20230213T052017Z-002\\bleach"
files = os.listdir(path)
file_set = set(files)  # set gives fast membership checks for the tag files

for filename in tqdm(files):
    # Only image files are candidates for the metadata.
    if filename.endswith((".png", ".jpg")):
        image_filename, image_extension = os.path.splitext(filename)
        text_file = image_filename + ".txt"
        # Skip any image that has no matching tag file.
        if text_file in file_set:
            with open(os.path.join(path, text_file), encoding='utf-8') as f:
                tags = f.read()
            data[image_filename] = {"tags": tags}

with open("output.json", "w") as outfile:
    json.dump(data, outfile)
Having to do this is a little annoying, but at least it solved my problem. You should see if you can implement this check in some way, so that anyone who downloads from a booru website and gets the tags automatically doesn't have to go through the same thing I did.
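For reference, here is a rough sketch of how such a check might look as a reusable function. This is not the actual code of `merge_dd_tags_to_metadata.py`; `collect_tags` and `IMAGE_EXTENSIONS` are hypothetical names, and matching extensions case-insensitively is an assumption that differs slightly from the script above.

```python
from pathlib import Path
import tempfile

# Assumption: the same two extensions the script above looks for.
IMAGE_EXTENSIONS = {".png", ".jpg"}


def collect_tags(image_dir):
    """Return (metadata, missing) for the images in image_dir.

    metadata maps each image stem to {"tags": ...};
    missing lists the image file names that have no .txt tag file.
    """
    metadata, missing = {}, []
    for image_path in sorted(Path(image_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # not an image, ignore
        tags_path = image_path.with_suffix(".txt")
        if not tags_path.is_file():
            # Record the image and move on instead of raising FileNotFoundError.
            missing.append(image_path.name)
            continue
        tags = tags_path.read_text(encoding="utf-8").strip()
        metadata[image_path.stem] = {"tags": tags}
    return metadata, missing


# Small demo on a throwaway directory: one tagged image, one untagged.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.png").touch()
    (root / "a.txt").write_text("1girl, sword\n", encoding="utf-8")
    (root / "b.jpg").touch()
    demo_metadata, demo_missing = collect_tags(root)

print(demo_metadata)
print(demo_missing)
```

Returning the skipped file names alongside the metadata means a caller could print a warning for them rather than dropping them silently.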
merge tags to metadata json.
  0% 0/4669 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "merge_dd_tags_to_metadata.py", line 62, in <module>
    main(args)
  File "merge_dd_tags_to_metadata.py", line 30, in main
    tags = tags_path.read_text(encoding='utf-8').strip()
  File "/usr/lib/python3.8/pathlib.py", line 1236, in read_text
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
  File "/usr/lib/python3.8/pathlib.py", line 1222, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
  File "/usr/lib/python3.8/pathlib.py", line 1078, in _opener
    return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/Kohya_Training_Data/bleach/safebooru_4219103_56a35eb2881f85bb85172e3d79aae15b.txt'
The file exists, but the script won't see it. This is with the Kohya fine-tuner, not LoRA.
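When a file appears to exist but Python reports it missing, one quick sanity check is to ask Python itself what it sees in that directory. The snippet below is only a diagnostic sketch, not a fix and not part of the Kohya scripts; `describe_path` is a hypothetical helper. It reports whether the exact path exists and which files in the same folder start with the same stem.

```python
from pathlib import Path
import tempfile


def describe_path(path_str):
    """Report whether a path exists and list similarly named siblings."""
    path = Path(path_str)
    parent = path.parent
    if not parent.is_dir():
        return {"exists": False, "parent_exists": False, "similar": []}
    # Siblings sharing the stem can reveal a renamed copy or a different extension.
    similar = sorted(p.name for p in parent.iterdir() if p.name.startswith(path.stem))
    return {"exists": path.is_file(), "parent_exists": True, "similar": similar}


# Demo: the expected tag file is absent, but a renamed copy is present.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "img (1).txt").write_text("tags", encoding="utf-8")
    report = describe_path(str(Path(tmp) / "img.txt"))

print(report)
```

Printing names with `repr()` can also expose stray whitespace or odd characters in a file name that are hard to spot in a file browser.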