Representation generators should not return filenames, because that hides what actually happened from the user. Right now, representation generators write files that are implicitly assumed to be whitespace-tokenized, and create_contexts then reads those files. If a file is not whitespace-tokenized, the user gets junk output without knowing why.
The way to fix:
Representation generators should take flags such as persist=True and filename='default.ext' that let us control whether and where they persist data. A generator can then check whether the file exists: if it does not, the generator creates it; if it does, the generator parses it and returns the data.
create_contexts should only build the context objects and the experiment data structure. It should not contain the file-reading logic, because that logic says "I assume that you passed me a list of whitespace-tokenized representations". All parsing logic should live in the parsers and/or the representation generators.
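
A minimal sketch of the proposed split, in Python. Only the persist and filename flags come from the proposal above; the function name tokenize_representation, its sentences argument, the persist=False default, and the dict-based signature of create_contexts are assumptions made for illustration and may not match the real code.

```python
import os


def tokenize_representation(sentences, persist=False, filename="default.ext"):
    """Hypothetical representation generator: returns data, never a filename.

    With persist=True it reuses `filename` if the file already exists,
    otherwise it computes the data and writes it there.
    """
    if persist and os.path.exists(filename):
        # The generator wrote this format, so the generator parses it.
        with open(filename, encoding="utf-8") as f:
            return [line.split() for line in f]

    data = [sentence.split() for sentence in sentences]

    if persist:
        with open(filename, "w", encoding="utf-8") as f:
            for tokens in data:
                f.write(" ".join(tokens) + "\n")
    return data


def create_contexts(representations):
    """Hypothetical create_contexts: builds per-sentence context dicts.

    `representations` maps a representation name to already-parsed rows.
    There is no file I/O and no tokenization assumption in here.
    """
    lengths = {len(rows) for rows in representations.values()}
    if len(lengths) > 1:
        raise ValueError("representations must all have the same number of rows")
    n_rows = lengths.pop() if lengths else 0
    return [
        {name: rows[i] for name, rows in representations.items()}
        for i in range(n_rows)
    ]
```

With this shape the caller always gets data back, for example create_contexts({"tokens": tokenize_representation(sentences, persist=True, filename="tokens.txt")}), and a file in an unexpected format becomes a problem for the generator that owns the format, not for create_contexts.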