en-wl / wordlist

SCOWL (and friends).

New proposed words from Mozilla dictionary #137

Closed by jorgk3 8 years ago

jorgk3 commented 8 years ago

I have been analysing the words that are in the Mozilla-maintained dictionary but not in SCOWL level 60.

The following 337 words are either already contained in SCOWL or have a promising rating for inclusion. Where they are included at level 70, Mozilla would appreciate having them promoted to level 60, since this is the level Mozilla uses. This way, Mozilla wouldn't have to maintain them separately.

abridgement/SM absorbance/S absorbancy/M acetyl acknowledgement/SM actin acyl advocator/MS adwares aggregator/MS agonist alkoxy anecdotally anonymization/SM anonymize/DSG anthropomorphize/DS antisense apatosaurus/M archaeoastronomy/M archaeologic archaeomagnetic archaeomagnetism archeologically arXiv/M aryl/SM assignee/M astroarchaeology/SM astrobiology/M astrobleme/S asynchronicity aurei aureus auteurs autocomplete/S avant-garde axe/M badging benzyl biodiesel/M bioinformatic/SM biosyntheses biosynthesis biotech/M blogroll/SM bloviate/GNDS bloviator/MS blowjob/MS bookselling botnets broadcasted/A cancelled/U canceller/M cancelling capita caravanserai carboxylic cDNA cerevisiae/SM charcuterie chemistries ciphertext/S closable/IE codec/SM codon/SM coli colonoscope/SM commenters compositeness concurrents conferable config/MS conformant conmanly corrigibility/M corrigible corruptibly court-martial/SDG crappiness crimeware/M cryonic cryptologist/MS cryptosystem/S cul-de-sac cultivar/SM cyber cytokine/MS datasheet/SM decertify/XNGDS degenerations dehydrogenase/M deliverables/U dequeue/GDS designee designings dialoged dialoging dialogued dialoguer dialoguing diatomaceous dihydro disarrangements disclaimable dissentious djinn donator/SM durian/SM eBook/MS eCommerce/M eldritch elicitor/MS encyclopaedia enqueue/DSG eschatologist/MS estoppel euthanized euthanizes euthanizing exacta/S exactable exactingness exactions exactor/SM exon/SM experimentalism faggoting faux filesystem/MS filmography financials fluidize/SG flyer/SM foci forma fracker/S freegan/S fundraising gamify/DSGN gastroenterologist/M gastroenterology genomics gigajoule/MS gigapixel/MS glycerine grande grey/TGDRSM greybeards guestbook/MS hentai hexanes hijab/S hippopotami holdem idolator/SM inactives inactivities incentivize/DSG incorrigibleness inkjet/SM intermediacy/S intermediated intermediateness intermediating intermediation/ES intermediator/MS interruptible/U intersex intersexual/MS intersexualism 
intersexuality iPods jewellery judgement/MS kbps keylogger/MS keylogging/SM kinase labelled/U lector/SM lepidopterist/SM limnological limnologist/MS limnology/M linguini linguistical mage/MS malwares mammalia meetup/MS megajoule/M mesothelioma/M metadata/M meth methoxy might've migrator/MS millennia misandrist/SM misandry miscommunications misjudgement/SM mitigations modeller/MS modelling/MS motorsport/MS mRNA multicast murine musculus must've namespace/SM nano natively neurosciences neuroscientist/MS neurosurgical newswires octopi offsite oligo onsite opposable opposer oxidase parallelization/MS parallelize/GDS parkour permalink/MS permittee phlebotomist/SM phlebotomize/DSG pho phosphorylate/GDNS pickiness plaintext polynucleotide/MS popup/SM poutine/S prejudgement/SM proclaimable procreations profiler/SM programmatically pronate/NDSG pronator/SM proprietorships propyl pseudorandom/Y quinoa racoon rasterization/M rasterize/GDRS reappointments recency recurse/DSG recuse/GDS reductase/M relocations renominations repartitions resize/DRSB rheumatological rheumatologist/MS rheumatology/M rootkit/SM rotatably sabre/SM sativa savoir schnaps schrod/S scot-free screensaver/MS searchable selfing selfism selfist/S seraphim serine shemale/SM shitloads showtimes signalling signup/SM sitemap/MS smoulder/G snarkily sommelier/MS spelt spick/S spywares steampunk stent/MS substituent/MS subsumptions syllabi synches synesthesia synesthete/S synesthetic synthase/MS tagline/MS taxa taxon telecom/M teleport/GSD teleportation terapixel/MS testcase/SM testsuite/MS textbox/MS thaliana therebetween theremins timelines toolbars traceur/SM trackback/MS transfect/GDS transgenderism transgene/S triages triaging tRNA/M tweep/S uncheck/SG undesignated unironic unironically vertebrata volcanological volcanologist/SM volcanology/M weaponize/DSG webdesign/MS whitepaper/MS wildcard/SM

kevina commented 8 years ago

Expanded and sorted list: to-add.txt

Will do additional processing later.
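For readers unfamiliar with the `/SM` notation in the original list, here is a minimal sketch of how such entries expand into full word forms. It assumes the common Hunspell en_US conventions where `S` adds a plural and `M` adds a possessive; the real rules live in the dictionary's affix file and cover many more flags. The names `expand_entry` and `pluralize` are hypothetical and are not part of SCOWL or Hunspell.

```python
import re


def pluralize(word: str) -> str:
    """Simplified plural rule, assumed to mirror the en_US 'S' flag."""
    if re.search(r"[^aeiou]y$", word):
        return word[:-1] + "ies"   # consonant + y -> ies
    if re.search(r"[sxzh]$", word):
        return word + "es"         # sibilant ending -> es
    return word + "s"


def expand_entry(entry: str) -> list[str]:
    """Expand 'word/FLAGS' into sorted forms; only S and M are handled here."""
    word, _, flags = entry.partition("/")
    forms = {word}
    if "S" in flags:
        forms.add(pluralize(word))
    if "M" in flags:
        forms.add(word + "'s")     # possessive
    return sorted(forms)


print(expand_entry("aggregator/MS"))
print(expand_entry("textbox/MS"))
```

Flags such as `D`, `G`, `N`, and `U` that appear in the list are deliberately ignored in this sketch.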

jorgk3 commented 8 years ago

Thanks a lot for considering the Mozilla input.

kevina commented 8 years ago

I added a few. A good number are in the dictionary but are considered variants. Here is what is left:

absorbances absorbancy absorbancy's actin advocator advocator's advocators adwares aggregator aggregator's aggregators alkoxy anecdotally anonymization anonymization's anonymizations anonymize anonymized anonymizes anonymizing anthropomorphize anthropomorphized anthropomorphizes antisense apatosaurus apatosaurus's archaeoastronomy archaeoastronomy's archaeologic archaeomagnetic archaeomagnetism arXiv arXiv's aryl aryl's aryls astroarchaeologies astroarchaeology astroarchaeology's astrobiology astrobiology's astrobleme astroblemes asynchronicity aurei auteurs autocomplete autocompletes avant-garde badging biodiesel biodiesel's bioinformatic bioinformatic's bioinformatics biosyntheses biotech's blogroll blogroll's blogrolls bloviate bloviated bloviates bloviating bloviation bloviator bloviator's bloviators blowjob blowjob's blowjobs bookselling botnets capita carboxylic cDNA cerevisiae cerevisiae's cerevisiaes charcuterie chemistries ciphertext ciphertexts closable codec codec's codecs codon's coli colonoscope colonoscope's colonoscopes commenters compositeness concurrents conferable config config's configs conformant conmanly corrigibility corrigibility's corrigible corruptibly court-martial court-martialed court-martialing court-martials crappiness crimeware crimeware's cryonic cryptologist cryptologist's cryptologists cryptosystem cryptosystems cul-de-sac cultivar's cyber cytokine cytokine's cytokines datasheet datasheet's datasheets decertification decertifications decertified decertifies decertify decertifying degenerations dehydrogenase's deliverables dequeue dequeued dequeues dequeuing designee designings dialogued dialoguer dialoguing diatomaceous dihydro disarrangements disclaimable disclosable disintermediation disintermediations dissentious donator donator's donators durian durian's durians eBook eBook's eBooks eCommerce eCommerce's elicitor elicitor's elicitors enqueue enqueued enqueues enqueuing eschatologist eschatologist's eschatologists euthanized 
euthanizes euthanizing exacta exactable exactas exactingness exactions exactor exactor's exactors exon's exons experimentalism filesystem filesystem's filesystems filmography financials fluidize fluidizes fluidizing forma fracker frackers freegan freegans gamification gamified gamifies gamify gamifying gastroenterologist gastroenterologist's gastroenterology gigajoule gigajoule's gigajoules grande greybeards guestbook guestbook's guestbooks hentai hexanes hijab hijabs holdem idolator's inactives inactivities incentivize incentivized incentivizes incentivizing inclosable incorrigibleness inkjet inkjet's inkjets intermediacies intermediacy intermediated intermediateness intermediating intermediation intermediations intermediator intermediator's intermediators interruptible intersexual intersexual's intersexualism intersexuality intersexuals iPods kbps keylogger keylogger's keyloggers keylogging keylogging's keyloggings lector lector's lectors lepidopterist lepidopterist's lepidopterists limnological limnologist limnologist's limnologists limnology limnology's linguistical mage mage's mages malwares mammalia meetup meetup's meetups megajoule megajoule's mesothelioma mesothelioma's metadata metadata's meth methoxy might've migrator migrator's migrators misandrist misandrist's misandrists misandry miscommunications mitigations motorsport motorsport's motorsports mRNA multicast murine musculus must've namespace namespace's namespaces nano natively neurosciences neuroscientist neuroscientist's neuroscientists newswires offsite oligo onsite opposable opposer parallelization parallelization's parallelizations parallelize parallelized parallelizes parallelizing parkour permalink permalink's permalinks permittee phlebotomist phlebotomist's phlebotomists phlebotomize phlebotomized phlebotomizes phlebotomizing pho phosphorylate phosphorylated phosphorylates phosphorylating phosphorylation pickiness plaintext polynucleotide polynucleotide's polynucleotides popup popup's popups 
poutine poutines proclaimable procreations profiler profiler's profilers programmatically pronate pronated pronates pronating pronation pronator pronator's pronators proprietorships propyl pseudorandom pseudorandomly quinoa rasterization rasterization's rasterize rasterized rasterizer rasterizes rasterizing reappointments recency recurse recursed recurses recursing recuse recused recuses recusing reductase's relocations renominations repartitions resizable resizer rheumatological rheumatologist rheumatologist's rheumatologists rheumatology rheumatology's rootkit rootkit's rootkits rotatably sativa savoir scot-free screensaver screensaver's screensavers searchable selfing selfism selfist selfists shemale shemale's shemales shitloads showtimes signup signup's signups sitemap sitemap's sitemaps snarkily sommelier sommelier's sommeliers spywares steampunk stent's substituent's substituents subsumptions synesthesia synesthete synesthetes synesthetic synthase synthase's synthases tagline tagline's taglines telecom telecom's teleport teleportation teleported teleporting teleports testcase testcase's testcases testsuite testsuite's testsuites textbox textbox's textboxes thaliana therebetween theremins timelines toolbars traceur traceur's traceurs trackback trackback's trackbacks transfect transfected transfecting transfects transgenderism transgene transgenes triages triaging tRNA tRNA's tweep tweeps uncheck unchecking unchecks undeliverables undesignated uninterruptible unironic unironically vertebrata volcanological volcanologist volcanologist's volcanologists volcanology volcanology's weaponize weaponized weaponizes weaponizing webdesign webdesign's webdesigns whitepaper whitepaper's whitepapers wildcard wildcard's wildcards

@biljir do you see anything in this filtered list worth adding?

biljir commented 8 years ago

Quite a lot of them, actually. Here are two lists: one of words worth adding with all their forms, and one where only the plural form appears on the list (presumably because the singular has already been added). These are based on my idea of words that Mozilla ought to recognize, not on the much more limited 12dicts vocabulary.

anthropomorphize
blowjob (more often one word than two)
court-martial (the most common form of the word - also, court-martialled/ing should be recognized as variants of the single l forms)
euthaniz(ed/es/ing)
guestbook (I think more common than the two word form)
hijab
meetup
metadata
meth
might've
must've
offsite
onsite
pickiness
popup (I think this is more common than the hyphenated form)
quinoa
recuse
screensaver (might be considered archaic now, though I still use one...)
steampunk
teleport (close call)
teleportation
uninterruptible
weaponize

For the plural list, I'm just noting that the plural is valid and should be included if the singular is.

auteurs botnets chemistries commenters financials greybeards iPods shitloads theremins timelines toolbars

jorgk3 commented 8 years ago

I'm glad you are finding some "useful" words amongst the Mozilla suggestions. Personally, I think "anecdotally" should be promoted from the large size to the normal size.

How about these?

rootkit/SM - https://en.wikipedia.org/wiki/Rootkit
wildcard/SM - https://en.wikipedia.org/wiki/Wildcard_character

And since you already have "showtime", why not add "showtimes"?

I've just noticed that techie circles often use the one-word variant, as in webdesign, whitepaper, tagline, popup, textbox, sitemap, signup, testcase, and testsuite, whereas the rest of the world writes two words.

kevina commented 8 years ago

Thanks @biljir. I added nearly all your suggestions except court-martial: I do not include hyphenated words, and even if I did, both court and martial are already in the dictionary. I added a few plurals; adding the others will be difficult for technical reasons, but I intend to revisit the issue.

@jorgk3 I added anecdotally, rootkit/SM, and wildcard. I looked up the compound words and added a few. Words like webdesign might be considered wrong by some.

biljir commented 8 years ago

Do you have "martial" as a verb (which, as a standalone word, it would not be)? If not, then court-martialed and court-martialing will not be accepted.
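To illustrate the concern, here is a rough sketch of a checker that accepts a hyphenated token only when every hyphen-separated part is itself in the word list. That splitting behaviour is an assumption about how the spell checker treats hyphens, and the function name `accepts` and the two-word dictionary are made up for the example.

```python
def accepts(token: str, dictionary: set[str]) -> bool:
    """Accept a token listed whole, or one whose hyphen-separated parts are all listed."""
    if token in dictionary:
        return True
    parts = token.split("-")
    # Only fall back to part-by-part checking for tokens that contain a hyphen.
    return len(parts) > 1 and all(part in dictionary for part in parts)


# Toy word list: "martial" is present, but its verb forms are not.
words = {"court", "martial"}

print(accepts("court-martial", words))
print(accepts("court-martialed", words))
```

Under this assumption, "court-martial" passes because both halves are listed, while "court-martialed" fails because "martialed" is not a word on its own.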

kevina commented 8 years ago

Thanks, Alan, for pointing that out. You are right, I do not have "martial" as a verb. Unfortunately, I do not support hyphenated words in SCOWL yet, so there is no way to add it for now. See issue #66.