AlexKuhnle / ShapeWorld

MIT License
58 stars 18 forks source link

[BUG] Reading dataset with alternatives infinite loop #28

Closed gstoica27 closed 3 years ago

gstoica27 commented 3 years ago

Hello!

I believe I've found the following bug in the dataset.py file under "ShapeWorld/shapeworld/dataset.py":

When reading a command-line generated dataset with alternatives (e.g. worlds_per_instance), the program encounters an infinite loop. Below I have placed the "offending" code. These are found on lines 563-583 of "dataset.py".

while True:
  if alts:
      i = 0
      v = list()
      while True:
          image_bytes = read_file('{}-{}-{}.{}'.format(value_name, n, i, image_format), binary=True)
          if image_bytes is None:
              break
          image_bytes = BytesIO(image_bytes)
          image = Image.open(image_bytes)
          v.append(World.from_image(image))
          i += 1
      value.append(v)
  else:
      image_bytes = read_file('{}-{}.{}'.format(value_name, n, image_format), binary=True)
      if image_bytes is None:
          break
      image_bytes = BytesIO(image_bytes)
      image = Image.open(image_bytes)
      value.append(World.from_image(image))
  n += 1

The problem is that while the "if image_bytes is None" exits the first while statement as there is a read attempt on a non-existent world, it does not exit the top-level while statement. Consequently, the code will attempt to read non-existent files infinitely.

However, I believe the fix is very simple. For instance, the following seems sufficient:

flag = True
while flag:
  if alts:
      i = 0
      v = list()
      while True:
          image_bytes = read_file('{}-{}-{}.{}'.format(value_name, n, i, image_format), binary=True)
          if image_bytes is None:
              flag = False
              break
          image_bytes = BytesIO(image_bytes)
          image = Image.open(image_bytes)
          v.append(World.from_image(image))
          i += 1
      value.append(v)
  else:
      image_bytes = read_file('{}-{}.{}'.format(value_name, n, image_format), binary=True)
      if image_bytes is None:
          break
      image_bytes = BytesIO(image_bytes)
      image = Image.open(image_bytes)
      value.append(World.from_image(image))
  n += 1

I.e., replacing the top-level "True" with a flag boolean variable, and updating the variable in the corresponding break conditional.

AlexKuhnle commented 3 years ago

You're right, that makes sense (after all, in the else case the loop is correctly terminated as you suggest). Would you mind creating a PR for this fix?

gstoica27 commented 3 years ago

Sure, np. I just made one here.

AlexKuhnle commented 3 years ago

Thanks for this! Closing now.