I believe I've found the following bug in the dataset.py file under "ShapeWorld/shapeworld/dataset.py":
When reading a command-line generated dataset with alternatives (e.g. worlds_per_instance), the program encounters an infinite loop. Below I have placed the "offending" code. These are found on lines 563-583 of "dataset.py".
while True:
if alts:
i = 0
v = list()
while True:
image_bytes = read_file('{}-{}-{}.{}'.format(value_name, n, i, image_format), binary=True)
if image_bytes is None:
break
image_bytes = BytesIO(image_bytes)
image = Image.open(image_bytes)
v.append(World.from_image(image))
i += 1
value.append(v)
else:
image_bytes = read_file('{}-{}.{}'.format(value_name, n, image_format), binary=True)
if image_bytes is None:
break
image_bytes = BytesIO(image_bytes)
image = Image.open(image_bytes)
value.append(World.from_image(image))
n += 1
The problem is that while the "if image_bytes is None" exits the first while statement as there is a read attempt on a non-existent world, it does not exit the top-level while statement. Consequently, the code will attempt to read non-existent files infinitely.
However, I believe the fix is very simple. For instance, the following seems sufficient:
flag = True
while flag:
if alts:
i = 0
v = list()
while True:
image_bytes = read_file('{}-{}-{}.{}'.format(value_name, n, i, image_format), binary=True)
if image_bytes is None:
flag = False
break
image_bytes = BytesIO(image_bytes)
image = Image.open(image_bytes)
v.append(World.from_image(image))
i += 1
value.append(v)
else:
image_bytes = read_file('{}-{}.{}'.format(value_name, n, image_format), binary=True)
if image_bytes is None:
break
image_bytes = BytesIO(image_bytes)
image = Image.open(image_bytes)
value.append(World.from_image(image))
n += 1
I.e., replacing the top-level "True" with a flag boolean variable, and updating the variable in the corresponding break conditional.
You're right, that makes sense (after all, in the else case the loop is correctly terminated as you suggest). Would you mind creating a PR for this fix?
Hello!
I believe I've found the following bug in the dataset.py file under "ShapeWorld/shapeworld/dataset.py":
When reading a command-line generated dataset with alternatives (e.g. worlds_per_instance), the program encounters an infinite loop. Below I have placed the "offending" code. These are found on lines 563-583 of "dataset.py".
The problem is that while the "if image_bytes is None" exits the first while statement as there is a read attempt on a non-existent world, it does not exit the top-level while statement. Consequently, the code will attempt to read non-existent files infinitely.
However, I believe the fix is very simple. For instance, the following seems sufficient:
I.e., replacing the top-level "True" with a flag boolean variable, and updating the variable in the corresponding break conditional.