sanskrit-lexicon / csl-corrections

Replacement for sanskrit-lexicon/CORRECTIONS. User corrections to sanskrit-lexicon/csl-orig
GNU General Public License v3.0

Find English errors #14

Closed drdhaval2785 closed 2 years ago

drdhaval2785 commented 4 years ago

Based on an error report aky->sky, I thought that there may be multiple such English errors in our dictionaries. I am trying to write a script to find out the potentially erroneous words and their suggested resolutions.

Following is a sample from MD.

Each line has the following structure: lnum:headword:commaSeparatedError
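As an illustration of the line structure above, here is a minimal sketch of how such a line could be read back. The names `ErrorLine` and `parse_error_line` are hypothetical and not taken from the actual script, and `lnum` is assumed to be an integer line number.

```python
from typing import List, NamedTuple


class ErrorLine(NamedTuple):
    lnum: int          # assumed: an integer line number for the entry
    headword: str      # headword of the entry
    errors: List[str]  # suspect English words reported for that entry


def parse_error_line(line: str) -> ErrorLine:
    """Parse one 'lnum:headword:commaSeparatedError' line."""
    # Split on the first two colons only, so a stray colon in the
    # error field does not break the record.
    lnum, headword, errors = line.rstrip("\n").split(":", 2)
    return ErrorLine(int(lnum), headword, [e for e in errors.split(",") if e])


if __name__ == "__main__":
    print(parse_error_line("224:agra:excelPage"))
```

Splitting with a maximum of two splits is a design choice in this sketch; the real files may never contain extra colons.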

EDIT - as on 01 Jan 2021, the error files are now stored at https://github.com/sanskrit-lexicon/CORRECTIONS/tree/master/english_error/output

drdhaval2785 commented 4 years ago

Please note that the suggestions are not checked. They are autogenerated by the pyenchant library; no adjustments have been made on our side.
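For readers who want a feel for this kind of filter, here is a rough sketch; it is not the actual script. The function names, the toy word list and the sample entry (including its headword) are made up for illustration. The checker is passed in as a callable so the sketch runs without extra packages; with pyenchant installed, `enchant.Dict("en_US").check` could presumably be passed instead. A real run would also need to strip Sanskrit text and markup from each entry first, which is not shown.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple

WORD_RE = re.compile(r"[A-Za-z]+")  # plain alphabetic tokens only


def find_suspect_words(text: str, is_known: Callable[[str], bool]) -> List[str]:
    """Return tokens the checker does not recognise, in order, without repeats."""
    seen = set()
    suspects = []
    for word in WORD_RE.findall(text):
        if word not in seen and not is_known(word):
            seen.add(word)
            suspects.append(word)
    return suspects


def report(entries: Iterable[Tuple[int, str, str]],
           is_known: Callable[[str], bool]) -> Iterator[str]:
    """Yield 'lnum:headword:err1,err2' lines for entries with suspect words."""
    for lnum, headword, text in entries:
        suspects = find_suspect_words(text, is_known)
        if suspects:
            yield f"{lnum}:{headword}:{','.join(suspects)}"


if __name__ == "__main__":
    # Toy word list standing in for a real English dictionary (assumption).
    known = {"the", "sky", "heaven"}
    is_known = lambda w: w.lower() in known
    # Hypothetical entry containing the reported 'aky' for 'sky' typo.
    for line in report([(1, "headword", "the aky heaven")], is_known):
        print(line)
```

Entries with no suspect words produce no output line, which keeps the report short enough for a human to review.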

funderburkjim commented 4 years ago

Based on your examples, looks like a good filter.

Will need human intervention before final changes. Maybe @sanskritisampada can help.

drdhaval2785 commented 4 years ago

Will place the script when I have made it work across all dictionaries. Maybe this weekend.

drdhaval2785 commented 4 years ago

EDIT 01 Jan 2021 - the error files are now stored at https://github.com/sanskrit-lexicon/CORRECTIONS/tree/master/english_error/output

The message below refers to old data. Kindly ignore it.

I am putting the text files here for a cursory look. As generating suggestions was making the script too slow, I have done away with them. Their absence does not seem to be a major impediment.

acc_error.txt ae_error.txt ap90_error.txt ben_error.txt bhs_error.txt bor_error.txt cae_error.txt gst_error.txt ieg_error.txt inm_error.txt mci_error.txt md_error.txt mw72_error.txt mwe_error.txt mw_error.txt pe_error.txt pgn_error.txt pui_error.txt shs_error.txt snp_error.txt vei_error.txt wil_error.txt yat_error.txt

gasyoun commented 4 years ago

Will place the script when I have made it work across all dictionaries

Eagerly waiting.

224:agra:excelPage
229:agre:beforein
260:aMkuSaH:higgle

I opened ap90_error and can't figure out what is to be corrected.

sanskritisampada commented 4 years ago

Sure!

gasyoun commented 4 years ago

@sanskritisampada so good to see you back.

sanskritisampada commented 4 years ago

Hey hi there! I've been working with Jim on the Lanman corrections past few months.

Hope everyone is doing well amidst the Corona crisis.

Regards.

gasyoun commented 4 years ago

working with Jim on the Lanman corrections past few months

Oh, nobody was aware of that other than Jim :)

gasyoun commented 3 years ago

I am putting the text files here for a cursory look.

@sanskritisampada and @AnnaRybakovaT are doing great work based on your scripting from the summer, thanks. So it was not in vain!

drdhaval2785 commented 3 years ago

I have missed a lot of fun in the recent past. How many dictionaries remain to be done @funderburkjim ?

gasyoun commented 3 years ago

I have missed a lot of fun in the recent past.

Oh sure you have. Sampada and Jim have been working through them systematically, although lately the pace has slowed a bit.

sanskritisampada commented 3 years ago

Hi Dhaval... We have only Yates to go...

funderburkjim commented 3 years ago

I also think that only Yates is to go, but we'll do a confirmation of all the output files to be sure.

drdhaval2785 commented 3 years ago

With #73 done, I think we have completed this exercise. Time to close once @funderburkjim has confirmed all the output files, to make sure that we have not missed anything.

sanskritisampada commented 3 years ago

Checked the files. Working to complete Benfey which was pending. Also a few cases in other dictionaries seem to have been left out. Coordinating with Jim on those. List will soon be complete!

Sampada

funderburkjim commented 2 years ago

I think we're done.