chrplr / openlexicon

Access to lexical databases
Creative Commons Attribution Share Alike 4.0 International

it's not clear how to contribute to Lexique #20

Closed v-gb closed 7 months ago

v-gb commented 11 months ago

Hello,

Ideally, I wanted to just make a pull request with an updated Lexique csv, but AFAICT, Lexique does not actually live in this repository. Is that right?

It would be useful for the root README to say how to concretely contribute fixes to Lexique.

I'm thinking of fixing things such as incorrect phonetic transcriptions, for example the u or the ° here:

> select ortho, phon, cgram from lexique where ortho in ('télétexte', 'quadruple');
quadruple   kwadRupl    ADJ
quadruple   kwadRupl    NOM
quadruple   kwadRupl    VER
télétexte   teletEks°t  NOM
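For anyone who wants to reproduce this lookup locally, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the table has been loaded into SQLite under the name `lexique`; the thread does not say which tool produced the output above, and the in-memory table here holds only the four rows quoted in this comment.

```python
import sqlite3

# Rows copied from the output quoted above; the real table has many more
# rows and columns. Only ortho, phon and cgram are modelled here.
ROWS = [
    ("quadruple", "kwadRupl", "ADJ"),
    ("quadruple", "kwadRupl", "NOM"),
    ("quadruple", "kwadRupl", "VER"),
    ("télétexte", "teletEks°t", "NOM"),
]


def lookup(conn, words):
    """Return (ortho, phon, cgram) rows for the given spellings."""
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT ortho, phon, cgram FROM lexique "
        f"WHERE ortho IN ({placeholders}) ORDER BY ortho, cgram"
    )
    return conn.execute(sql, list(words)).fetchall()


# Build a throwaway in-memory database (an assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lexique (ortho TEXT, phon TEXT, cgram TEXT)")
conn.executemany("INSERT INTO lexique VALUES (?, ?, ?)", ROWS)

for ortho, phon, cgram in lookup(conn, ["télétexte", "quadruple"]):
    print(ortho, phon, cgram, sep="\t")
```
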

I'm also wondering (although it's currently theoretical) about whether you'd accept a new column "h aspiré" with a boolean encoded in some way.
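To make the proposal concrete, here is one possible encoding, sketched in Python. Everything specific in it is an assumption for illustration: the column name `h_aspire`, the values ("1" for h aspiré, "0" for h muet, empty for words that do not start with h), and the tab-separated layout. The new column is appended after the existing ones so that their order does not change.

```python
import csv
import io

# Hypothetical column name and encoding (not part of Lexique):
# "1" = h aspiré, "0" = h muet, "" = word does not start with "h".
H_ASPIRE_COLUMN = "h_aspire"


def add_h_aspire(tsv_text, aspirated):
    """Append the hypothetical h_aspire column to tab-separated text.

    `aspirated` is a set of spellings known to start with an h aspiré.
    Existing columns are kept unchanged and in their original order.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + [H_ASPIRE_COLUMN]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        word = row["ortho"]
        if word.lower().startswith("h"):
            row[H_ASPIRE_COLUMN] = "1" if word in aspirated else "0"
        else:
            row[H_ASPIRE_COLUMN] = ""
        writer.writerow(row)
    return out.getvalue()


# Tiny made-up sample; "haricot" has an h aspiré, "homme" an h muet.
SAMPLE = "ortho\tcgram\nharicot\tNOM\nhomme\tNOM\nquadruple\tADJ\n"
result = add_h_aspire(SAMPLE, aspirated={"haricot"})
print(result)
```
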

chrplr commented 10 months ago

Hi,

Indeed, the databases are not included in the openlexicon GitHub repository (they are too large).

Lexique3 is currently maintained by my colleague Boris New, who incorporates some of the proposed corrections. You can contact him at boris.new@univ-smb.fr

You are totally right that I should put this in the README associated with Lexique3. Will do.

Regarding the additional column, this is a good proposal, but it is better suited to Lexique4 (work in progress). Some old scripts may misbehave if we change the structure of Lexique3.

Best regards

v-gb commented 10 months ago

Thanks! I reached out to your colleague.

Regarding Lexique4, I'm wondering if it's a waste of time to be touching Lexique3 at this point. Do the corrections for Lexique3 also result in improving Lexique4? And what kind of vague timeline do you have in mind for Lexique4: weeks? three months? More than 6 months?

chrplr commented 10 months ago

For Lexique4, I am looking at another text-to-phonetic transcription tool that will hopefully generate fewer mistakes. There is no fixed time frame, though I hope we can release it this year.

-- Christophe Pallier (http://www.pallier.org) INSERM Cognitive Neuroimaging Lab (http://www.unicog.org)
