gligen / GLIGEN

Open-Set Grounded Text-to-Image Generation
MIT License

Great Work!!! Few queries regarding the dataset preparation #39

Open VIROBO-15 opened 1 year ago

VIROBO-15 commented 1 year ago

When I try to merge the TSV files for Flickr, I get the following error:

  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 324, in <module>
    merge(args.merge_in_folder, args.merge_out_folder)
  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 288, in merge
    for idx in range(len(reader)):
  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 137, in __len__
    return self.num_rows()
  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 102, in num_rows
    self._ensure_lineidx_loaded()
  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 145, in _ensure_lineidx_loaded
    self._lineidx = [int(line) for line in lines]
  File "/scratch/project_462000189/anwer/mn/GLIGEN/tsv_split_merge.py", line 145, in <listcomp>
    self._lineidx = [int(line) for line in lines]
ValueError: invalid literal for int() with base 10: '8760\t{"data_id": 8760, "image": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQgJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyM

Could you please help with this?

lmm077 commented 11 months ago

Have you fixed the problem? I'm running into this too.

shizhouxing commented 10 months ago

These lines (linked below) assume that, in the list returned by os.listdir, the lineidx_files come in the first half and the tsv_files in the second half. That doesn't necessarily hold, because os.listdir returns entries in arbitrary order. The file names should be checked instead.

https://github.com/gligen/GLIGEN/blob/f0ede1e5dc9e5f710fd564da297a3c1ba71a20b0/tsv_split_merge.py#L271-L272
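To illustrate why a positional split is fragile, here is a minimal sketch. It is not the repository's code: `split_by_position`, `parse_lineidx`, and the sample listing are hypothetical stand-ins for the behaviour described above. The parsing step mirrors the `[int(line) for line in lines]` expression from the traceback.

```python
def split_by_position(files):
    """Hypothetical stand-in: treat the first half of the listing as
    .lineidx files and the second half as .tsv files."""
    half = len(files) // 2
    return files[:half], files[half:]


def parse_lineidx(lines):
    """Mirror of the traceback's parsing step: every line must be an integer."""
    return [int(line) for line in lines]


# A made-up directory listing in an order os.listdir is allowed to return.
listing = ["0.tsv", "0.lineidx", "1.lineidx", "1.tsv"]

lineidx_files, tsv_files = split_by_position(listing)

# Files that ended up in the "lineidx" half without actually being .lineidx.
misclassified = [f for f in lineidx_files if not f.endswith(".lineidx")]

# Reading a .tsv row as if it were a lineidx entry fails in int().
try:
    parse_lineidx(['8760\t{"data_id": 8760}'])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With this listing, a .tsv file lands in the "lineidx" half, and parsing one of its rows raises the same kind of ValueError reported in the issue.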

RanJason-Code commented 10 months ago

first:
files = sorted(os.listdir(merge_in_folder))

second:
lineidx_files = [file for file in files if file.endswith(".lineidx")]
tsv_files = [file for file in files if file.endswith(".tsv")]
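The two steps above can be combined into a small helper. This is only a sketch under one assumption that the thread does not confirm: that each `X.tsv` has a companion `X.lineidx` with the same base name. The function name `pair_tsv_with_lineidx` and the demo folder contents are hypothetical.

```python
import os
import tempfile


def pair_tsv_with_lineidx(merge_in_folder):
    """Return (tsv, lineidx) file-name pairs, selected by extension
    rather than by position in the directory listing."""
    files = sorted(os.listdir(merge_in_folder))
    lineidx_files = {f for f in files if f.endswith(".lineidx")}
    tsv_files = [f for f in files if f.endswith(".tsv")]

    pairs = []
    for tsv in tsv_files:
        # Assumption: the index file shares the .tsv file's base name.
        expected = os.path.splitext(tsv)[0] + ".lineidx"
        if expected not in lineidx_files:
            raise FileNotFoundError(f"no .lineidx file found for {tsv}")
        pairs.append((tsv, expected))
    return pairs


# Demo on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as folder:
    for name in ["1.tsv", "0.lineidx", "0.tsv", "1.lineidx", "notes.txt"]:
        open(os.path.join(folder, name), "w").close()
    demo_pairs = pair_tsv_with_lineidx(folder)
```

Note that `sorted` orders names lexicographically, so "10.tsv" sorts before "2.tsv". If the merge order matters, a numeric sort key may be needed.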