Open mitchblank opened 4 years ago
I wouldn't trust this dataset much; there is no information about how the author generated it. There is a dataset out there created by a trusted French institute, but I can't recall the source right now. It would just need some reformatting.
The french-verb-conjugation.csv file has places where the same infinitive (column 1) appears multiple times in the table:
There seem to be multiple sources of this.
First, there are 4 cases where 100% identical lines appear in the CSV file:
Those are easy to ignore.
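For anyone who wants to reproduce the counts, here is a minimal sketch of how the repeats could be separated into exact duplicate lines and rows that share an infinitive but differ elsewhere. It assumes the file is comma-delimited UTF-8 with the infinitive in column 1; the function names `load_rows` and `classify_duplicates` are made up for illustration.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read the CSV into a list of rows (assumes comma-delimited UTF-8)."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.reader(f) if row]


def classify_duplicates(rows):
    """Group rows by infinitive (column 1) and split the repeats into
    exact duplicates and rows that share an infinitive but differ."""
    by_infinitive = defaultdict(list)
    for row in rows:
        by_infinitive[row[0]].append(tuple(row))

    exact = {}        # infinitive -> number of redundant identical rows
    conflicting = {}  # infinitive -> list of distinct rows
    for infinitive, group in by_infinitive.items():
        if len(group) < 2:
            continue
        # dict.fromkeys drops exact repeats while keeping first-seen order
        distinct = list(dict.fromkeys(group))
        if len(distinct) < len(group):
            exact[infinitive] = len(group) - len(distinct)
        if len(distinct) > 1:
            conflicting[infinitive] = distinct
    return exact, conflicting
```

Calling `classify_duplicates(load_rows("french-verb-conjugation.csv"))` should return the identical-line cases in the first dictionary and the cases worth a closer look in the second. If the file has a header row, it is treated as just another row, which is harmless as long as its first cell is unique.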
The remaining ones are cases where the same infinitive appears, but the rest of the verb forms include a prefix. For example, there is a normal entry for `pouvoir` but another line that is the entry I would expect for `repouvoir`:

There doesn't seem to be a normal entry for `repouvoir` in the CSV, so it seems that the prefix is just stripped from the infinitive form? That pattern seems to hold for four of the other duplicated infinitives (including `moudre`, which appears as an infinitive 3 times):

- `clore` -> `forclore`
- `éclore` -> `déclore`
- `moudre` -> `émoudre`, `remoudre`
- `pouvoir` -> `repouvoir`
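The prefix pattern above could be checked mechanically. The following is only a sketch under assumptions I have not verified against the file: that each cell holds bare verb forms (no subject pronouns) and that alternatives inside a cell are separated by semicolons. The helper names `forms` and `inferred_prefix` are hypothetical.

```python
def forms(cell, sep=";"):
    """Split a cell into its alternative forms (assumed ';'-delimited)."""
    return [form for form in cell.split(sep) if form]


def inferred_prefix(base_row, other_row):
    """Return the prefix p if every compared form in other_row equals
    p + the matching form in base_row; otherwise return None.
    Column 1 (the infinitive) is skipped because it is identical."""
    prefix = None
    for base_cell, other_cell in zip(base_row[1:], other_row[1:]):
        for base, other in zip(forms(base_cell), forms(other_cell)):
            # The other form must be strictly longer and end with the base form
            if len(other) <= len(base) or not other.endswith(base):
                return None
            candidate = other[: len(other) - len(base)]
            if prefix is None:
                prefix = candidate
            elif candidate != prefix:
                return None  # inconsistent prefixes across columns
    return prefix
```

For two rows that share an infinitive, a non-`None` result suggests the second row really belongs to a prefixed verb whose infinitive lost its prefix. Empty cells are skipped, so defective verbs do not cause false negatives on their own.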
I am far from a native French speaker, but this looks strange to me. Other verbs with prefixes have the infinitive prefixed as well (there are over 300 `re-` infinitives in the file, for example), so I don't know why these repeat the infinitive.

Then there are 6 other cases where an infinitive appears twice with actually different data:

- `accroître`
- `décroître`
- `départir`
- `faillir`
- `parfumer`
- `ressortir`
Again, I don't know enough French to say whether these entries are correct (in the sense that there are two distinct verb conjugations, perhaps for reflexive vs non-reflexive use?). However, take `accroître` as an example. The two entries in the CSV file basically differ on whether the past participle is `accru` or `accrû`. This appears to just be a genuine spelling controversy (the recent dictionaries I checked all give it as `accrû`, but I have a 1972 copy of Harrap's which lists it as `accru`). Other entries in the CSV file indicate this with a semicolon-delimited list of alternatives (e.g. `paye;paie`).
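If the maintainers decide that a pair like the two `accroître` rows is just a spelling variant, the rows could be folded into one using the same semicolon convention. This is a rough sketch rather than a proposed fix: `merge_rows` is a made-up helper, and whether merging is linguistically appropriate for each of the six verbs is a question for someone who knows French better than I do.

```python
def merge_rows(row_a, row_b, sep=";"):
    """Merge two rows for the same infinitive, joining differing cells
    into a sep-delimited list of alternatives (first-seen order)."""
    if row_a[0] != row_b[0]:
        raise ValueError("rows must share the same infinitive")
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same number of columns")

    merged = []
    for cell_a, cell_b in zip(row_a, row_b):
        # Collect alternatives from both cells, dropping repeats and blanks
        alternatives = dict.fromkeys(cell_a.split(sep) + cell_b.split(sep))
        merged.append(sep.join(alt for alt in alternatives if alt))
    return merged
```

Cells that already agree are left unchanged, and a cell that is empty in one row simply takes the value from the other.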