Identify texts already_done on the fly

Currently we first build the list of texts already done, then start parsing. As a consequence, if a text gets processed elsewhere while a long list of texts is being parsed, we will reparse it anyway.

Once this is done, we could run parse_many in parallel (for instance by year, to parallelize the data collection).
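A rough sketch of what the per-year split could look like. Everything here is an assumption for illustration: `parse_batch` stands in for parse_many (whose real signature is not shown in this issue), and the input is assumed to be a list of `(year, url)` pairs.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_year(texts):
    """Group (year, url) pairs into {year: [url, ...]}, keeping input order."""
    groups = defaultdict(list)
    for year, url in texts:
        groups[year].append(url)
    return dict(groups)


def parse_years_in_parallel(texts, parse_batch, max_workers=4):
    """Run one batch per year concurrently.

    parse_batch is a hypothetical callable taking a list of URLs; it is a
    placeholder for parse_many, not its actual interface.
    """
    groups = group_by_year(texts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            year: pool.submit(parse_batch, urls)
            for year, urls in groups.items()
        }
        # .result() re-raises any exception raised inside a batch
        return {year: future.result() for year, future in futures.items()}
```

Threads are used here only to keep the sketch short; separate processes (or separate command-line invocations, one per year) would work the same way, provided each worker checks for already-done texts at parse time rather than from a list built at startup.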

The complication is that we only have the URL at that point, so we need to identify the corresponding directory.
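A minimal sketch of the on-the-fly check, under stated assumptions: `url_to_dir_name` is a made-up URL-to-directory mapping (the parser's real naming scheme is precisely the open question here), `parse_one` is a hypothetical single-text parser, and a text is considered done as soon as its directory exists.

```python
import os
import re


def url_to_dir_name(url):
    """Hypothetical mapping from a text URL to its output directory name.

    The real parser may derive the directory differently; this is only a
    placeholder so the check below can be written down.
    """
    slug = re.sub(r"^https?://", "", url)
    return re.sub(r"[^A-Za-z0-9]+", "-", slug).strip("-")


def is_already_done(url, output_root):
    """Check the filesystem at call time, not a list built at startup."""
    return os.path.isdir(os.path.join(output_root, url_to_dir_name(url)))


def parse_pending(urls, output_root, parse_one):
    """Parse each URL unless its directory exists when we reach it.

    parse_one(url, dest_dir) is a hypothetical stand-in for the real
    single-text parsing function. Returns the URLs actually parsed.
    """
    parsed = []
    for url in urls:
        if is_already_done(url, output_root):
            continue  # done by an earlier run or by a concurrent one
        parse_one(url, os.path.join(output_root, url_to_dir_name(url)))
        parsed.append(url)
    return parsed
```

Because the check happens just before each text is parsed, a text finished by another run in the meantime is skipped. If a directory can exist while its text is still incomplete, the check would need a stronger signal than the directory alone.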

Bonus: there is a redundant "skip_already_done" option in "format_data_for_frontend" that is never used and should be removed.

regardscitoyens / the-law-factory-parser

Identify texts already_done on the fly #104