lucidrains / deep-daze

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
MIT License
4.37k stars 327 forks source link

Dealing With Punctuation #65

Closed afiaka87 closed 3 years ago

afiaka87 commented 3 years ago

Filenames need to be sanitized of any punctuation before being saved. Any '.' characters, for instance, will be rendered unreadable by IPython.display, although they do manage to save on Linux. I don't think this will be the case on Windows though.

Anyway, I'd like to use punctuation as apparently it is considered by CLIP, but the filenames should be cleaned of it.

import string, re
def underscorify(value):
  no_punctuation = str(value.translate(str.maketrans('', '', string.punctuation)))
  spaces_to_one_underline = re.sub(r'[-\s]+', '_', no_punctuation).strip('-_') # strip gets rid of leading or trailing underscores
  return spaces_to_one_underline

This is a variation on the method django uses to create its stubs from text phrases. Strips all punctuation and converts any amount of spaces to a single underscore.

This will overwrite phrases' files using the same ordering of words but different punctuation. Would need to add in collision detection and append a 'num_collision' count to the file upon collision.