marcoagpinto / aoo-mozilla-en-dict

English Dictionaries Project (AOO+Mozilla+others)

So and so addition and potential errors #28

Closed Ding-adong closed 5 years ago

Ding-adong commented 5 years ago

Aboriginal/SM African-American/S BitInstant CD/S Cyclopean able/nVvYNTDP add DP so you can remove abled ableness accelerant/S allocute/SD allotriophagy amnio/S anthropomancy aphrodisiacal/S apple-cart arachnophobe assemblance asshole/S assimilator/S astrometric/S attaboy bachelorette/SM backswing/S badass/S baldie bicep/S biohazardous biomechanoid bioscan blindsided blowjob/S bodge/SRD boondoggle brassic bustier/S capsize/SD check-up/S childbed clasp/SUGDM coffee-house cohabitate cool-bag counter-insurgency defensive/IS device/S - noun dumpster fasciitis first-born flatline gasohol goddamn hydrobromide infomercial intel leucotome leucotomies life-force mine-detector necrotise offensive/IYPSM osteoplasty post-mortem postdate/GDS pretension/S radiosterilise/Dn scotograph seize/BD shake-down sophomore staff/ADGSR stenograph/SGDZ sub-editorship tarp taser/Sd tasseography terawatt thoron tiddler/S tiffin tikka tod toke tosspot tox tuxedo visual/sYQSq3 voiceprint wack waggon wakey weaverbird/S welch wetback/S whoremonger windshield wish-list world-view xenoanthropology xenobiology yip/S zit/S

indecenter: not a valid word
depersonzlised: potential spelling mistake

Why don't you use /Y instead of two words?

There were many duplicate lines, which I have removed.

marcoagpinto commented 5 years ago

Hello!

Thank you for the word suggestions:

Comments:
CD/SM (already in the speller)
aphrodisiacal (it is an adjective - it doesn't have a plural)
apple-cart (neither Oxford nor Collins uses a hyphen)
asshole (both Oxford and Collins say it is US, GB uses "arsehole")
astrometric (it is an adjective - it doesn't have a plural)
bachelorette (Oxford says it is US)
badass (Oxford says it is US)
blindsided (Oxford and Collins say it is US)
boondoggle (Oxford says it is US)
check-up (already in the speller)
cool-bag (neither Oxford nor Collins uses a hyphen)
device/S - noun (it is already in the speller, although not as "device" but as "vice" with a prefix)
dumpster (American)

Haven't found "depersonzlised" and removed "indecenter".

I haven't had the chance to go through the entire list and I will be away at the weekend. I will resume on Monday.

The reason why I didn't use /Y instead of two words is because I am a lazy arse. When I am trying to add as many words as possible in a single day, most of the time I add the words individually. I know it is not the correct thing to do, but one day I will go through the whole .dic and try to improve it.

Also, there are THOUSANDS of duplicates (legacy). I am planning to implement a feature in Proofing Tool GUI that will merge the flags, but other tasks have priority, such as a feature for adding and removing words from a list for the AU+CA+NZ spellers (so that we can find maintainers for those variants based on my speller, with country-specific words replacing GB words).

Here is what I have been able to change for now:

4-JAN-2019 Suggested on GitHub by Ding-adong:
42587) Aboriginal (added singular)
42588) African-American (+plural +'s - Collins)
42589) BitInstant (+'s - Wikipedia)
42590) Cyclopean (uppercase - Collins)
42591) able/nVvYNTDP (added DP to remove abled + ableness)
42592) accelerant (+plural +'s)
42593) allocute (+s +ing +ed)
42594) allotriophagy (Wiktionary)
42595) amnio (+plural +'s)
42596) anthropomancy
42597) aphrodisiacal (it is an adjective - it doesn't have plural)
42598) arachnophobe
42599) assemblance
42600) assimilator (+plural +'s)
42601) astrometric (it is an adjective - it doesn't have plural)
42602) attaboy
42603) backswing (+plural +'s)
42604) baldie (+plural +'s)
42605) bicep (added singular)
42606) biohazardous
42607) biomechanoid (+plural +'s - Wiktionary)
42608) bioscan (+plural +'s - Wiktionary)
42609) blowjob (+plural +'s - Wiktionary)
42610) bodge (+s +ing +ed +er +ers)
42611) brassic
42612) bustier (added singular +'s)
42613) capsize (+s +ing +ed +'s)
42614) childbed
42615) unclasp's + clasp's (merged into clasp)
42616) coffee-house (+plural +'s - Collins)
42617) cohabitate (+s +ing +ed - Wiktionary)
42618) counter-insurgency
42619) defensive + indefensive (it is an adjective - it doesn't have plural)
42620) fasciitis
42621) first-born (another way of firstborn - Collins)
42622) flatline (+s +ing +ed +er +ers)
42623) indecenter (REMOVED - TYPO)

On Monday I will continue.

Thank you for your suggestions.

Kind regards,

marcoagpinto commented 5 years ago

Comments:
gasohol (US)
goddamn (US)
leucotomies (already in speller)
mine-detector (neither Oxford nor Collins has a hyphen)
radiosterilise (not in Oxford/Collins/Wiktionary)
shake-down (neither Oxford nor Collins has a hyphen)
sophomore (US)
tarp (US)
tuxedo (US)
wetback (US)
windshield (US)
wish-list (neither Oxford nor Collins has a hyphen)

I got up early in the morning to do the last of the list before my job:

5-JAN-2019 Suggested on GitHub by Ding-adong:
42624) hydrobromide (+plural +'s - Wiktionary)
42625) infomercial (+plural +'s - American, but there is no GB similar)
42626) intel
42627) leucotome (Collins)
42628) life-force (Collins)
42629) necrotize (+s +ing +ed -IZE)
42630) necrotise (+s +ing +ed -ISE)
42631) offensive/IYPSM (merged flags)
42632) osteoplasty
42633) post-mortem
42634) postdate (+s +ing +ed - Collins)
42635) pretension (+plural +less)
42636) scotograph (+plural +'s - Wiktionary)
42637) seize (+s +ing +ed +able +bility)
42638) staff/ADGSR
42639) stenography (merged into stenograph)
42640) sub-editorship
42641) taser (+s +ing +ed)
42642) tasseography (Wiktionary)
42643) terawatt (+plural +'s)
42644) thoron (Collins)
42645) tiddler (+plural +'s)
42646) tiffin
42647) tikka
42648) tod
42649) toke (+s +ing +ed +er +ers)
42650) tosspot (+plural +'s)
42651) tox
42652) visual/8sY-9QSq3 (merged flags)
42653) voiceprint (+plural +'s)
42654) voiceprinter
42655) wack
42656) waggon (+plural +'s)
42657) wakey (Wiktionary)
42658) weaverbird (+plural +'s - Collins)
42659) welsher + welcher (+plural +'s)
42660) welsh + welch (+s +ing +ed)
42661) whoremonger (flag !)
42662) world-view (Collins)
42663) xenoanthropology (Wiktionary)
42664) xenobiology
42665) yip (+s +ing +ed)
42666) zit (+plural +'s +y)

Ding-adong commented 5 years ago

depersonzlized - sorry, my software converts ize to ise automatically. It's on line 21292.
CD/SM - found it. I looked at the old CDs and put /SM on here as a fix when you already had a fix, ha ha.

Is having two slashes (//) an error, e.g. Cwm//M?

I use the dictionary 99% of the time for subtitles, and converting the spelling of US TV programmes to GB is obviously important. Speed is equally important. I use Subtitle Edit to correct the mistakes in subtitle files, and the less time spent on checking the spelling the better. The spelling of words and the use of US words are therefore two different issues. The common US words I listed above are used all the time, including in UK programmes, and are valid in daily life (e.g. movie) regardless of the dictionary; people checking subtitle files with a spellchecker need speed too. It isn't possible to switch between the US and GB dictionaries all the time, as this creates more problems with spelling. When new words or words not in the dictionary are flagged up, I click on 'add to user dictionary'. I do not see a problem with having common everyday US words (words, not spellings) that GB people use in the GB dictionary.

Examples:

On the issue of hyphens, the question is whether it is valid usage or not. Having a hyphen in the above words is valid, regardless of the dictionary. It is common usage and makes spell checking work faster. There is no harm in adding them to the dictionary.

Device!!! Oh come on, do you expect me and others to look at 'vice' and then see if there is a prefix? 'Device' is a word in its own right and should be in the dictionary. One extra line won't hurt:
vice/SM
device/S

/Y: I did the correction last year, but the annoying thing is that when you release a new updated dictionary, I make a comparison and add the new words. It became cumbersome to compare when /Y isn't the same between my file and yours. I will check and put up the correction later.

How do you check for duplicate lines? I use Notepad++ to load the dic file and do the corrections. Sometimes I use your Proofing Tool software for double checking. However, when I did a comparison 2 days ago I noticed that some common words were missing. Being a tad suspicious, I did some analysis. I use textfx (textfx tools: sort out only unique, then sort sensitive). I got suspicious when I noticed about 400 lines were removed. I then did a comparison between the old dic and the new dic, and it turns out that textfx doesn't handle too many lines very well. It removed words when it shouldn't have and left duplicate words in. I downloaded some other software: duplicateremover and textmechanic both failed, and textcrawler worked 100% as intended. Do you have any recommendations?

cheers

marcoagpinto commented 5 years ago

> Is having two // an error ie. Cwm//M ?

Yes, it is an error.

Thanks for spotting.
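A minimal sketch of how entries like Cwm//M could be caught automatically, assuming the .dic is read as plain lines of `word/flags`. The helper name `find_double_slash` is hypothetical and is not part of Proofing Tool GUI.

```python
def find_double_slash(lines):
    """Return (line_number, entry) pairs for entries containing '//'."""
    hits = []
    for number, line in enumerate(lines, start=1):
        entry = line.strip()
        # Assumption: a doubled separator is always a typo in this dictionary.
        if "//" in entry:
            hits.append((number, entry))
    return hits


sample = ["Cwm//M", "CD/SM", "device/S"]
print(find_double_slash(sample))
```

Reading the real file with `open(path, encoding="utf-8")` and passing the file object in place of `sample` should work the same way, since the function only iterates over lines.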

I have decided not to go to my weekend job because of my illness.

I spoke with my PhD supervisor who told me to stay at home until I am cured.

I have medicines to take for six days or more.

I will send an e-mail to my British friends about your previous comment, asking for their suggestions.

This means that they can still be added.

My dear brother, I will see what they will say.

Thank you and kind regards,

jorgk3 commented 5 years ago

My 0.001% of wisdom. I'd add the words bulleted above, well, maybe except blindsided and boondoggle, which are rather US English. BTW, both are in Mozilla's en-US dictionary.

Personally I think it doesn't make much sense to exclude US nouns, you already have truck and movie.

marcoagpinto commented 5 years ago

> My 0.001% of wisdom. I'd add the words bulleted above, well, maybe except blindsided and boondoggle which are rather US English. BTW, both in Mozilla's en-US dictionary.
>
> Personally I think it doesn't make much sense to exclude US nouns, you already have truck and movie.

Thanks, Jörg, soon I will add them.

marcoagpinto commented 5 years ago

@Ding-adong

> * asshole and arsehole are two different words and not a spelling issue. It would not be wise to change asshole into arsehole since they are pronounced differently.
> * bachelorette has common usage.
> * badass has common usage.
> * blindsided has common usage in past tense. 'blind side' common in GB.
> * boondoggle has common usage by economists/journalists in GB.
> * dumpster common usage and not wise to convert to 'skip', UK word, since they are pronounced differently.
> * and so on... Hence the reason for my suggestion is they are used daily.
> * radiosterilise is definitely a word, medical. The sterilisation of an organism, or a surgical implement, by ionizing radiation such as X-rays or gamma rays.

I will add the words in these bullets, except for the two mentioned by Jörg.

> On the issue of hyphens, the question is whether it is valid usage or not. Having a hyphen in the above words is valid, regardless of the dictionary. It is common usage and makes spell checking work faster. There is no harm in adding them to the dictionary.

I always try to find the words in Oxford+Collins+Wiktionary before adding them.

> Device!!! Oh come on, do you expect me and others to look at 'vice' and then see if there is a prefix? 'Device' is a word in its own right and should be in the dictionary. One extra line won't hurt: vice/SM device/S

Yes, I know. It is "legacy". The maintainers before me did it this way. I don't like to use prefix flags because they make it harder to find out whether a word is in the .dic.

> How do you check for duplicate lines? I use Notepad++ to load the dic file and do the corrections. Sometimes I use your Proofing Tool software for double checking. However, when I did a comparison 2 days ago I noticed that some common words were missing. Being a tad suspicious, I did some analysis. I use textfx (textfx tools: sort out only unique, then sort sensitive). I got suspicious when I noticed about 400 lines were removed. I then did a comparison between the old dic and the new dic, and it turns out that textfx doesn't handle too many lines very well. It removed words when it shouldn't have and left duplicate words in. I downloaded some other software: duplicateremover and textmechanic both failed, and textcrawler worked 100% as intended. Do you have any recommendations?

Yes, there is an option in PTG "Check for duplicates" that decodes the whole .dic and generates a list of the repeated words which can be exported into a .txt file.

It will show that 1000s of words are duplicated; they are "legacy" (99.99% or so were there before I took over the project in 2013). I must code a feature into PTG that will merge the flags, though I am not sure when I will have the time to do it.
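For anyone without PTG at hand, a rough sketch of the same kind of duplicate report, assuming a .dic whose first line is the word count and whose entries look like `word/flags`. The function name `find_duplicates` is hypothetical, and this is not how PTG itself decodes the file.

```python
from collections import defaultdict


def find_duplicates(lines):
    """Map each repeated word to the list of full entries that share it."""
    entries = defaultdict(list)
    for index, line in enumerate(lines):
        entry = line.strip()
        if not entry:
            continue
        # Assumption: a purely numeric first line is the .dic word count.
        if index == 0 and entry.isdigit():
            continue
        word = entry.split("/", 1)[0]  # text before the first slash
        entries[word].append(entry)
    return {word: found for word, found in entries.items() if len(found) > 1}


sample = ["3", "example/M", "example/S", "device/S"]
print(find_duplicates(sample))
```

The comparison is case sensitive and ignores escaped slashes or extra morphological fields, so the report would need a manual look before anything is deleted.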

This afternoon I will add the bullet words, then I will comment here.

marcoagpinto commented 5 years ago

Done!

42667) asshole (+plural +'s)
42668) bachelorette (+plural +'s)
42669) badass (+plural +'s)
42670) dumpster (+plural +'s)
42671) radiosterilise

marcoagpinto commented 5 years ago

@jorgk3

Are there more words I should add?

Thank you!

marcoagpinto commented 5 years ago

I also fixed the "device/SM" from "vice" that used a prefix.

Ding-adong commented 5 years ago

> 42619) defensive + indefensive (it is an adjective - it doesn't have plural)

It is a noun too, e.g. in stock-market terminology: a defensive, and the plural for more than one stock is defensives.

Let's look at the words around defence etc.

defence (third-person singular simple present defences, present participle defencing, simple past and past participle defenced)

defence/5DGmS remove defenceman/men no p since we need defensive
defenceless/PY remove defencelessness
defend/7bdrS Vuv not needed
defendant/MS
defendee/MS
remove defended/U see un... at the bottom.
defenestrate/DSG
defensive/IPSY
defensibility/M not sure M is needed
defensible/IY covers defensibly
defensibly/I duplicate
undefended
undefendable

All sorted as far as I know, and this should tidy up the entries.

In the AFF file, why is there man's, men's and woman's, but no women's?

I am going through my user dic, as there are well over 1000 words of all sorts in there. It is used by Subtitle Edit to prevent the spellchecker from flagging up silly words. If there are some valid missing words I will let you know.

Get well, cheers

marcoagpinto commented 5 years ago

> In AFF file why is there man's; men's; woman's; but no women's ?

Where?!

I think I fixed it months ago:
SFX 5 Y 16
SFX 5 0 swoman [bdknmt]
SFX 5 0 swoman [aeiou][bdklmnt]e
SFX 5 0 woman [^aeiou][bdklmnt]e
SFX 5 0 woman [^bdklmnt]e
SFX 5 0 woman [^bdeknmt]
SFX 5 0 swomen [bdknmt]
SFX 5 0 swomen [aeiou][bdklmnt]e
SFX 5 0 women [^aeiou][bdklmnt]e
SFX 5 0 women [^bdklmnt]e
SFX 5 0 women [^bdeknmt]
SFX 5 0 swoman's [bdknmt]
SFX 5 0 swoman's [aeiou][bdklmnt]e
SFX 5 0 woman's [^aeiou][bdklmnt]e
SFX 5 0 woman's [^bdklmnt]e
SFX 5 0 woman's [^bdeknmt]
SFX 5 0 women's [^bdeknmt] <- Fixed here?

There are still other flags to fix (legacy), but I need time to do it and it must be done carefully to avoid messing things up.

Ding-adong commented 5 years ago

My aff was out of date. I looked at the PT pull down menu and it doesn't show women's under 5. It doesn't show up when using 5 under defence.

men have:
SFX m 0 smen's [bdknmt]
SFX m 0 smen's [aeiou][bdklmnt]e
SFX m 0 men's [^aeiou][bdklmnt]e
SFX m 0 men's [^bdklmnt]e
SFX m 0 men's [^bdeknmt]

women have:
SFX 5 0 women's [^bdeknmt]

It looks like these are missing:
SFX m 0 swomen's [bdknmt]
SFX m 0 swomen's [aeiou][bdklmnt]e
SFX m 0 women's [^aeiou][bdklmnt]e
SFX m 0 women's [^bdklmnt]e
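To illustrate why women's does not show up for defence, here is a simplified sketch of how suffix rules of this shape might be applied. It assumes each rule line is `SFX flag strip add condition`, treats the condition as a pattern matched at the end of the word, and assumes the four missing rules would carry flag 5. The helpers `parse_rules` and `apply_rules` are hypothetical and only approximate what a real spell checker does.

```python
import re


def parse_rules(aff_lines, flag):
    """Collect (strip, add, condition) tuples for one suffix flag."""
    rules = []
    for line in aff_lines:
        parts = line.split()
        # Rule lines have five fields here; a header such as "SFX 5 Y 16" has four.
        if len(parts) == 5 and parts[0] == "SFX" and parts[1] == flag:
            rules.append((parts[2], parts[3], parts[4]))
    return rules


def apply_rules(word, rules):
    """Return every form produced by rules whose condition matches the word ending."""
    forms = []
    for strip, add, condition in rules:
        if not re.search(f"(?:{condition})$", word):
            continue
        if strip == "0":  # "0" means nothing is stripped
            stem = word
        elif word.endswith(strip):
            stem = word[: -len(strip)]
        else:
            continue
        forms.append(stem + add)
    return forms


old = ["SFX 5 0 women's [^bdeknmt]"]
fixed = old + [
    "SFX 5 0 swomen's [bdknmt]",
    "SFX 5 0 swomen's [aeiou][bdklmnt]e",
    "SFX 5 0 women's [^aeiou][bdklmnt]e",
    "SFX 5 0 women's [^bdklmnt]e",
]
print(apply_rules("defence", parse_rules(old, "5")))
print(apply_rules("defence", parse_rules(fixed, "5")))
```

With only the single women's rule, defence ends in e and is excluded by `[^bdeknmt]`, so nothing is produced; with the four extra rules, the `[^bdklmnt]e` condition matches.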

marcoagpinto commented 5 years ago

@Ding-adong

SFX m Y 20
SFX m 0 sman [bdknmt]
SFX m 0 sman [aeiou][bdklmnt]e
SFX m 0 man [^aeiou][bdklmnt]e
SFX m 0 man [^bdklmnt]e
SFX m 0 man [^bdeknmt]
SFX m 0 smen [bdknmt]
SFX m 0 smen [aeiou][bdklmnt]e
SFX m 0 men [^aeiou][bdklmnt]e
SFX m 0 men [^bdklmnt]e
SFX m 0 men [^bdeknmt]
SFX m 0 sman's [bdknmt]
SFX m 0 sman's [aeiou][bdklmnt]e
SFX m 0 man's [^aeiou][bdklmnt]e
SFX m 0 man's [^bdklmnt]e
SFX m 0 man's [^bdeknmt]
SFX m 0 smen's [bdknmt]
SFX m 0 smen's [aeiou][bdklmnt]e
SFX m 0 men's [^aeiou][bdklmnt]e
SFX m 0 men's [^bdklmnt]e
SFX m 0 men's [^bdeknmt]

The .aff description written by the original creator states that the "m" flag is for "man" and "men" use (male).

marcoagpinto commented 5 years ago

@Ding-adong

SFX 5 Y 16
SFX 5 0 swoman [bdknmt]
SFX 5 0 swoman [aeiou][bdklmnt]e
SFX 5 0 woman [^aeiou][bdklmnt]e
SFX 5 0 woman [^bdklmnt]e
SFX 5 0 woman [^bdeknmt]
SFX 5 0 swomen [bdknmt]
SFX 5 0 swomen [aeiou][bdklmnt]e
SFX 5 0 women [^aeiou][bdklmnt]e
SFX 5 0 women [^bdklmnt]e
SFX 5 0 women [^bdeknmt]
SFX 5 0 swoman's [bdknmt]
SFX 5 0 swoman's [aeiou][bdklmnt]e
SFX 5 0 woman's [^aeiou][bdklmnt]e
SFX 5 0 woman's [^bdklmnt]e
SFX 5 0 woman's [^bdeknmt]
SFX 5 0 women's [^bdeknmt]

I know these 4 could perhaps be added here,

But, isn't it risky to do?

Ding-adong commented 5 years ago

I forgot to change the flag. Risky? I see no reason why it would be: I changed the aff file to include the missing 4 (with the correct flag, ha ha) and changed the count after Y from 16 to 20. It worked and showed women's, producing the same result as m (man).

It's your software, up to you.

marcoagpinto commented 5 years ago

@Ding-adong

I added the 4 missing ones at the end:
SFX 5 Y 20
SFX 5 0 swoman [bdknmt]
SFX 5 0 swoman [aeiou][bdklmnt]e
SFX 5 0 woman [^aeiou][bdklmnt]e
SFX 5 0 woman [^bdklmnt]e
SFX 5 0 woman [^bdeknmt]
SFX 5 0 swomen [bdknmt]
SFX 5 0 swomen [aeiou][bdklmnt]e
SFX 5 0 women [^aeiou][bdklmnt]e
SFX 5 0 women [^bdklmnt]e
SFX 5 0 women [^bdeknmt]
SFX 5 0 swoman's [bdknmt]
SFX 5 0 swoman's [aeiou][bdklmnt]e
SFX 5 0 woman's [^aeiou][bdklmnt]e
SFX 5 0 woman's [^bdklmnt]e
SFX 5 0 woman's [^bdeknmt]
SFX 5 0 women's [^bdeknmt]
SFX 5 0 swomen's [bdknmt]
SFX 5 0 swomen's [aeiou][bdklmnt]e
SFX 5 0 women's [^aeiou][bdklmnt]e
SFX 5 0 women's [^bdklmnt]e

Could you check whether this works 100% as intended?
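One thing that can be checked mechanically is whether the number in the header (the 20 in SFX 5 Y 20) matches the number of rule lines that follow it. A small sketch, assuming the first SFX line seen for a flag is its header; `check_sfx_counts` is a hypothetical helper and says nothing about whether the rules themselves are correct.

```python
def check_sfx_counts(aff_lines):
    """Return {flag: (declared, actual)} for suffix flags whose counts disagree."""
    declared = {}
    actual = {}
    for line in aff_lines:
        parts = line.split()
        if len(parts) < 4 or parts[0] != "SFX":
            continue
        flag = parts[1]
        is_header = (
            flag not in declared
            and len(parts) == 4
            and parts[2] in ("Y", "N")
            and parts[3].isdigit()
        )
        if is_header:
            declared[flag] = int(parts[3])
        else:
            actual[flag] = actual.get(flag, 0) + 1
    return {
        flag: (count, actual.get(flag, 0))
        for flag, count in declared.items()
        if count != actual.get(flag, 0)
    }


rules = ["SFX 5 0 women's [^bdeknmt]"] * 20  # stand-in for the 20 real rule lines
print(check_sfx_counts(["SFX 5 Y 16"] + rules))
print(check_sfx_counts(["SFX 5 Y 20"] + rules))
```

An empty result means every header agrees with its rule count; a stale header such as Y 16 over 20 rules is reported with both numbers.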

Thank you!

Also, what is your real name, so I can add it to the README file (FLAG "5" FIXED BY BLAH BLAH)?

Ding-adong commented 5 years ago

Just use Ding-adong. I don't give out real names over the internet.

marcoagpinto commented 5 years ago

oki

marcoagpinto commented 5 years ago

@Ding-adong

Is this okay?:

2019-02-01 — Improved flag "5" thanks to the GitHub user Ding-adong: Some "swomen's" and "women's" entries were missing.

marcoagpinto commented 5 years ago

By the way, I have released an update for Proofing Tool GUI today: http://proofingtoolgui.org/default.htm#downloads

Now I need some rest.

Ding-adong commented 5 years ago

It's fine. Still going through my list. I can't believe you don't have bonk in the dictionary. Check back tomorrow.

Ding-adong commented 5 years ago

All done. There may be some repeats, as I was doing it all on one page.

able/nVvYNTDP add DP so you can remove abled ableness Aboriginal/SM accelerant/S accension aerogramme pl African-American/S allocute/SD allotriophagy ambiversion amnio/S animatable can't add able to animate. anthropomancy anti-aggression anti-collision anti-corrosion anti-depression anti-recession anti-subversion aphrodisiacal/S apple-cart arachnophobe assemblance asshole/S assimilator/S Astra British name of a car astrometric/S attaboy bachelorette/SM backswing/S badass/S baldie ball-breaker BarcaLounger correct use of u/l case barrel-bombing barrel-chested barrel-roofed barrel-vaulted barrel/GMDS6 believeth bevvy/DGS bicep/S bioelectronically biohazardous biomechanoid bioscan BitInstant bladder/MSd blindsided blondie blowjob/S bodge/SRD bonk/drS boogeyman/M boondoggle bosting on its own boyo on its own brassic Bremner surname brick/drS remove bricker brickie/S broadcast/SdAR bruit/Sd bruv/S bruvver/S bucky/S slang word for gun budgetarily bunce Bundt bustier/S butt/SMZ Z to include word (bacon and egg) butty yum yum butter/drZS cannae capsize/SD cauda CD/S celebre charver
check-up/S childbed chippy/SM circular/qYQPMS civvy/S clank/DMkGSr clasp/SUGDM class/57mS clit/Z remove clitty cloak/DMGSC for decloak etc coffee-house cohabitate conductant conservatorship consigliere contessa cool-bag cossie counter-aggression counter-insurgency coupe cozzie creme crikey cruft cryobank cub/dWw3SD1GMZ add Z for cubby - short for cubbyhole cubby cuz Cyclopean célèbre either or both below decloaking decontaminant/S defensive/IS deke/DG depersonzlised potential spelling mistake derepression descension +al device/S - noun dipshit dirtbag +s discommendation disfiguration +s dishumour disinvesting diss/SGD dissension dissensus dissentient doofer +s doofus douchebag +s downtown drabs dribs druggist +s drugstore +s dumbass dumpster +s earless eclosion elastoplast elusion emerse +ed sion endeavourment engrams erase/NDRLS erm escarole et cetera can the dic hold two words? euthanisation +s euthanising evidentiary evulsion examinate expat +s facie as in prima facie faggoty fannying fasciitis faucet favourless feedings fella +s 's fessing fest fiance fiancee fide filofax first-born flat-screen flatline flavourfully foie footie +s forbearers fortepiano fricking fructuous fuckwad gals gasohol GCE GCSE gelato geo Geordies gib gimbal +led gimme glamourise glamorisation glamorised glamoriser glamorising u for ise and no u for ize glamourless glamourpuss GLC gnarly goddamn goddamn +ed it gofer goofer gook +s grandkid +s grav grippy gullibly gumball gyppo gyrocopter haphephobia happenstance harbourward hardliner Harlem harmonists hasta hath haute haviour haviour havioured haviours herbology hocked +ing hombre honorous honourability honouree honouress honoursman honourworthy hooky hopers horlick +s hors horseshit hostiles hot-wire hovercrafts howay humourer humourful humourise humoursome huzzah hydrobromide hypospray indecenter not a valid word independency infomercial ingenue inhesion innit instal intel jacksie jalapenos jammy jankers jerries jetpack jobcentre jubbly 
juvie kart +s kayaking khazi kick-ass kickabout kiddy klick +s klutz knobbed knockdown kraut kroners labourism labourist +pl labouristic labourite labourites labourless laboursome laboursomely laboursomeness lairy lateen latina leftie leucotome leucotomies lez lezzie +s life-force limey livener lookie looky looning lord/DcSMG add f for underlord/overlord lotta macked +ing magnificents majoring malodour +pl manhunter manky marketeering martialled masse matey matrimonials maxed mazel mea med +s mid-watch millionairess mine-detector mischaracterised missable MIT molecular/CQY molecularised moonshiner moonshiner +s moonshining motherfucker +s mucker +s nanite +s narked nasion naused nauses necrotise negatory neighbourless neighbourlike nerk neuroanatomic neurosynaptic neurotherapy nicker nighter niner non-admission non-comprehension non-conclusion non-decision non-inclusion non-possession notarised nuff numpty nutjob nutter odourful odourlessly odourlessness offensive/IYPSM oncological opticals oralists osteoplasty outgeneralled outmanned overpersuasion paedo panto papi paviour +pl usa-pavour peasy percentre +s perv +s pervo pervy phenobarbital phenylethylamine pilau pillock pils pina pinata pissant plasmatic platonically plosion +al pneumoencephalogram poncing poncy poo poofter poppadom +s popsicle porkie portaloo positronic post-mortem postdate/GDS potless pouty pow poxed poxy prannet prannock pranny pratting precess/GDSNx precognitively prelim prepped prepping pretension/S prima primavera priors proctology program/BRGSJDMC protege +s protegees provolone punking puree putz radiosterilise/Dn rambunctious rancoured rancourless rancourproof rancours rathole razzle reaccession recalibration +s recce recision recon reconstructor +s recuse +d refamiliarise refiltration regionals regs reimagining reindeers reinitiated reinvasion relatable repagination repoing repressurised repressurising reprovision repulser respirate restabilising restante retinopathy rhetoricals 
rhetoricals ribcages ricin righty rindless roadsweeper rock'n'roll rollocking rollover roomie roomies rooming ruckus rumourer rumourist rumourmongering antirumour safehouse saltine salvagers salved sandboy sanguine/YnC sapien sarge sarkiness sarky sarnie satay satcom sauvignon saviouress saviourship savourable savourer savouringly savourless savourlessness savourly savorous savorously savoursome scaffolders schmoozing schmuck schoolyard scientifical scotograph scouser scow scrumping scrutineering securitisation +s seize/BD sellotape semihonour sentients septicoloured shake-down shalt shapeable sharecropper shill shipwide shite shuck +s +ing sicko sidearm +s silverback simp skank +s skiving slappers slayed ing +s sleazebag sleazeball slickers slinged slingshotting slurpy smackhead smartass smartish smithereen snarfing snuck sonofabitch sophomore souffle spearman spec +s spic spick spiffing squinty staff/ADGSR stat stenograph/SGDZ stevia stimulators stodged strategise +d streetcar strewth strippergram stronging strumped strumping sub-editorship subatomically subcentre subcommission subdecision sublight submachine subprime supercar supercars supersession surveil surveilled surveilling swallowers swapping's sweared swordsmith swotting symbiote synopsised synthale synthehol takedown takeout tannoy tarp taser/Sd tasered tasseography tearaways technicolour techo tegretol telecommuted telled terawatt tete thoron thready tiddler/S tiffin tightener tikka timp tis tiser toblerone tod toerag toerags toke tonk torygraph tosspot towners tox toxology trach trailhead tranq transwarp trattorias trebuchet trebuchets trepidatious tri tricorder tricorders trifecta tryer tumoural tumourlike antitumour antitumoural turbolift turbolifts tux tuxedo twink twonk uber unpleasantries upstate uxb vacationing vacuuming valedictorian vapourability vapourable vapoured vapourer vapourers vapouring vapouringly vapourings vapourless vapourlike vapourtight vapoury varmints vid vig vigourless vigours 
visual/sYQSq3 voiceprint wack waggon wakey wallbanger wallies wally warrantless washcloths washroom weaverbird/S weeing welch welches wetback/S whassup whatcha whirrs whoremonger whup wifey wildflowers windshield wiseass wish-list woah woodhouse world-view wotcha wunderbar wuss xenoanthropology xenobiological xenobiology yakking yip/S youse zipper zit/S ziti

marcoagpinto commented 5 years ago

Hey hey!

Thank you very much for the wordlist.

My holiday is gone and I will only be able to start adding the words next week.

Meanwhile, I have released the 1-FEB speller for Thunderbird and Firefox (V2.69).

I will only release the next version around 1-MAR, and it will be released for Thunderbird, Firefox, OpenOffice and LibreOffice.

Meanwhile, I have been improving Proofing Tool GUI and have been able to incorporate the regex library into it (I didn't know PureBasic had one built in). With the regex I found out that there are more codes in the rules, such as the dots, and it handles more complex rules better, producing a larger (extracted) wordlist.

Thanks!

Kind regards,

Ding-adong commented 5 years ago

What regex do you need, and what for? I have some regexes that convert US to GB easily.

marcoagpinto commented 5 years ago

To decode the .aff file :-)

I thought it would be much faster than doing it by "hand".

But the speed is about the same, so, what is slowing down the process isn't in that part of the code I replaced.

Ding-adong commented 5 years ago

Some more names, all upper case, for the dic. Just copy and paste, then remove duplicates. I have tidied some duplicates in your dic eg. example/M example/S to example/MS.

Too many chars... uploaded to https://github.com/Ding-adong/aoo-mozilla-en-dict/tree/master/en_GB%20(Marco%20Pinto) filename names.txt

marcoagpinto commented 5 years ago

Hello!

The duplicate words are something I want to deal with in a future version of Proofing Tool GUI.

I will build a feature that will merge all duplicate flags.

It is in my TODO list.
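A minimal sketch of what such a merge could look like, in the spirit of turning example/M and example/S into example/MS. It assumes every flag is a single character and that entries are plain `word/flags` lines without the leading word count; `merge_duplicates` is a hypothetical name, not the planned PTG feature.

```python
def merge_duplicates(lines):
    """Merge entries that share a word into one entry with the union of their flags."""
    merged = {}  # word -> list of flags; dicts keep first-seen order
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        word, _, flags = entry.partition("/")
        seen = merged.setdefault(word, [])
        for flag in flags:  # assumption: one character per flag
            if flag not in seen:
                seen.append(flag)
    result = []
    for word, flags in merged.items():
        result.append(word + "/" + "".join(flags) if flags else word)
    return result


print(merge_duplicates(["example/M", "example/S", "device/S"]))
# ['example/MS', 'device/S']
```

A plain union may not always be safe: putting a prefix flag and a suffix flag on the same entry could allow combined forms that neither original entry accepted, so merged entries would still deserve a review.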

Ding-adong commented 5 years ago

I was playing with PT and used bulk import for the names, and it takes tooooooo long. Notepad++ does it in <10 seconds using textfx.

marcoagpinto commented 5 years ago

> I was playing with PT and used bulk import for the names, and it takes tooooooo long. Notepad++ does it in <10 seconds using textfx.

It is normal.

PureBasic is slow with strings because, on every access to a string, it seeks the end position instead of having the size stored.

People have been complaining in the forum for years but, so far, "fast strings" haven't been implemented.

Ding-adong commented 5 years ago

iTouch iZombie

marcoagpinto commented 5 years ago

> iTouch iZombie

42973) iTouch (+plural +'s - Wiktionary)
42974) iZombie (+'s - name - Wikipedia) <- Name of TV series

marcoagpinto commented 5 years ago

@Ding-adong

What happened?

Hours ago your account vanished from GitHub, along with all your posts.

Now they are back!

I have started adding your words, but for each word I add I look for related ones.

I have already added some.

Ding-adong commented 5 years ago

Account was flagged by mistake. All sorted.

Ding-adong commented 5 years ago

account/MBlSGp
joy/pMDG6jSc
pigeonhole/SMDG no hyphen

marcoagpinto commented 5 years ago

> account/MBlSGp joy/pMDG6jSc pigeonhole/SMDG no hyphen

43048) account/MBlSGp (Ding-adong)
43049) joy/pMDG6jSc (Ding-adong)
43050) pigeonhole (removed hyphen - Ding-adong + added +er +ers +'s)

Ding-adong commented 5 years ago

accrementition acumentin acuminate/DGnS acuminose acuminulate aeon/SMWO aeonian altitudinarian aptitude/SMO remove O attitudinal/Y remove add attitudinally axonometric but/DAGS remove A then add rebut below as root word. butter/drZA button/UdSA carbon-12 and -13 carbona carbonado carbonification carbonify citrinin citrination configuration/oOAM criminous/PY cytogenesis cytogeny dominical dominie endotherm/SOW endothermis erythema/OW fiction/MSo^ remove O habitude haemagglutinate/SDGn haemal haemangioma/S haematemesis haematic/S haematin/S haematocele haematoma/S haematophagous haematopoietic/1 haematoporphyria haemocyanin haemoglobinuria haemolsis haemolymph haemostat/S industry/oMSOG industry/oOMS intervention/OSM3 remove interventionist/S jealous/YPZ remove Z lymphoedema mesoderm/OW nymph/SMOW nympha nymphal nympholept nymphology nymphaeum nymphomaniac/SO oedematic oedemic organism/OMWS1 orthopaedic/SYZ oxyhaemoglobin papilloedema parenchyma/OW preganglionic postganglionic prep/SMD rebut/SDGOBL^ remove rebutment rebuttable smartish symbiont/MS thready tibialis volume/MSD

marcoagpinto commented 5 years ago

Hey hey!

I believe I have added most of your words, but I am brain toasted and can't remember.

I have also added a paragraph to the Mozilla site saying that if anyone finds words that both Oxford and Collins mark as US, they should tell me and I will remove them.