# python
import collections

# Map each image number to the list of parts (1-4) whose file listing contains it.
cnt = collections.defaultdict(list)
for i in range(1, 5):
    # print(i)
    with open("{}.txt".format(i), "r") as f:
        for line in f:
            line = line.strip()
            # print(line)
            if line.endswith(".png"):
                # File names look like ".../<number>.png"; keep the number as the key.
                k = int(line.split('/')[-1].split(".png")[0])
                # print(k)
                cnt[k].append(i)
            # break
print("#images:", len(cnt))  # 69080

print("check separated images")
n_separated = 0
for k in cnt:
    # An image listed in more than one part was split by the download.
    if len(cnt[k]) > 1:
        n_separated += 1
        # print(k, "in:", cnt[k])
print("#separated images:", n_separated)  # 914
I downloaded the FFHQ dataset you provided from Google Drive, but it is split into 4 parts.
I tried the methods in [1-5], but they all failed: the extracted data raises errors during preprocessing.
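As far as I can tell, 914 images were separated across 2 parts by Google Drive during the download. To check this, I first listed the file names in all 4 parts (saved as `1.txt` to `4.txt`), then looked for images that appear in more than one part, using the script above.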
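For reference, the per-part file listings (`1.txt` to `4.txt`) could be produced roughly as follows. This is only a sketch: it assumes each downloaded part opens as a valid zip archive, and the helper `write_listing` and the archive names in the usage comment are hypothetical, not the actual names Google Drive gives the parts.

```python
import zipfile


def write_listing(archive_path, listing_path):
    """Write one archive member name per line and return how many were written.

    Assumption: `archive_path` is a zip file that can be opened on its own.
    """
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
    with open(listing_path, "w") as f:
        for name in names:
            f.write(name + "\n")
    return len(names)


# Hypothetical usage, with placeholder archive names:
# for i in range(1, 5):
#     write_listing("part{}.zip".format(i), "{}.txt".format(i))
```

Each resulting text file holds one member path per line, which is the input format the counting script reads.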
How do you solve this problem?
Thanks