redaktor closed 9 years ago
yeah, i agree. 'towns'.pluralize() should be 'towns'. gimme a sec.
yeah, maybe I got it already - we are doing the same now ;)
In general it must be [delete/rollback my commit] - I am deeply sorry.
singularize_rules SHOULD cover 'plural to singular' AND 'singular to singular'.
;) no problem. can you check out the current version? I've added some towns.pluralize()==towns tests and they look so-far-so-good
:+1: will check. If you've got an additional minute: This is a WIP proposal of the "dictionary" I mentioned above. Basically this could be a helper for development. It is not complete. Please read the TODO comments and see the generator functions at the end. And if I added 'rules' the same way as 'words', then logic and (language or context specific) data would be fully separated for development. It is a simple generator function with which we could generate the data module part... But now it is easy for our "machines" ;) What do you think?
@spencermountain Yep. works. I need to correct the dictionary for the nouns a bit, and there are now some dups already covered by a rule:
[
['move', 'moves'],
['photo', 'photos'],
['video', 'videos'],
['rodeo', 'rodeos'],
['stomach', 'stomachs'],
['shoe', 'shoes'],
['epoch', 'epochs'],
['zero', 'zeros'],
['avocado', 'avocados'],
['halo', 'halos'],
['tornado', 'tornados'],
['tuxedo', 'tuxedos'],
['sombrero', 'sombreros']
];
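A rough sketch of how such dups could be found automatically: run every dictionary pair through the rule-based pluralizer and keep only the pairs it gets wrong. Both `pluralizeByRule` and `dropCoveredByRule` are hypothetical names for illustration, and the toy rule below is an assumption, not the library's actual rule set.

```javascript
// Hypothetical stand-in for the rule-based pluralizer (assumption):
// it only knows "add es after s, x, z, ch, sh" and "add s" otherwise.
function pluralizeByRule(singular) {
  if (/(s|x|z|ch|sh)$/.test(singular)) {
    return singular + 'es';
  }
  return singular + 's';
}

// Keep only the pairs whose plural the rules do NOT already produce.
function dropCoveredByRule(pairs, pluralize) {
  return pairs.filter(function (pair) {
    return pluralize(pair[0]) !== pair[1];
  });
}

var dictionary = [
  ['photo', 'photos'],   // covered by rule, can be dropped
  ['shoe', 'shoes'],     // covered by rule, can be dropped
  ['child', 'children'], // irregular, must stay
  ['goose', 'geese']     // irregular, must stay
];

var remaining = dropCoveredByRule(dictionary, pluralizeByRule);
console.log(remaining); // [ [ 'child', 'children' ], [ 'goose', 'geese' ] ]
```

Passing the pluralizer in as an argument means the same check could be run against the real rules once they are stable.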
The dictionary also adds some more irregular plurals not covered by rule and compresses them by replacing the plural by singular.slice(0,-2) ...
updated dictionary, haven't looked at what could go to rules (the other way around), checking now ... of course the dictionary generators could become gruntified themselves. As said, it is work in progress. The replace function will get a short syntax; for now, e.g., irregular plurals look like
/* singular nouns having irregular plurals */
var lang = 'en'; // quoted: a bare `en` would throw a ReferenceError
var noun_irregulars = (function() {
var zip = [ [ 'child', '=ren' ],
[ 'person', 'people' ],
[ 'leaf', '_av$' ],
[ 'database', '=s' ],
[ 'quiz', '=z$' ],
[ 'goose', 'ge$e' ],
[ 'phenomenon', '_a' ],
[ 'barracks', '=' ],
[ 'deer', '=' ],
[ 'syllabus', '_i' ],
[ 'index', '_ic$' ],
[ 'appendix', '_ic$' ],
[ 'criterion', '_a' ],
[ 'i', '_we' ],
[ 'man', '_en' ],
[ 'she', 'they' ],
[ 'he', '_t=y' ],
[ 'myself', 'ourselv$' ],
[ 'yourself', '_lv$' ],
[ 'himself', 'themselv$' ],
[ 'herself', 'themselv$' ],
[ 'themself', '_lv$' ],
[ 'mine', 'ours' ],
[ 'hers', 't_irs' ],
[ 'his', 't_eirs' ],
[ 'its', 'the_rs' ],
[ 'theirs', '=' ],
[ 'sex', '=e_' ],
[ 'narrative', '=s' ],
[ 'addendum', '_a' ],
[ 'alga', '=e' ],
[ 'alumna', '=e' ],
[ 'alumnus', '_i' ],
[ 'bacillus', '_i' ],
[ 'beau', '=x' ],
[ 'cactus', '=$' ],
[ 'château', '=x' ],
[ 'corpus', '_ora' ],
[ 'curriculum', '_a' ],
[ 'die', '_ice' ],
[ 'echo', '=$' ],
[ 'embargo', '=$' ],
[ 'foot', 'feet' ],
[ 'formula', '=s' ],
[ 'genus', '_era' ],
[ 'graffito', '_ti' ],
[ 'hippopotamus', '_i' ],
[ 'larva', '=e' ],
[ 'libretto', '_ti' ],
[ 'loaf', '_av$' ],
[ 'matrix', '_ic$' ],
[ 'memorandum', '_a' ],
[ 'mosquito', '=$' ],
[ 'opus', '_era' ],
[ 'ovum', '_a' ],
[ 'ox', '_=en' ],
[ 'radius', '=$' ],
[ 'referendum', '_a' ],
[ 'tableau', '=x' ],
[ 'that', '_ose' ],
[ 'thesis', '_$' ], // was a second 'that' entry, which expanded to 'theses'
[ 'thief', '_ev$' ],
[ 'this', '_$e' ],
[ 'tooth', 'teeth' ],
[ 'vita', '=e' ] ];
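// Expand the shorthand: '=' is the whole singular, '_' is the singular
// minus its last two characters, and every '$' stands for 'es'.
var main = zip.map(function (arr) {
  arr[1] = arr[1].replace('=', arr[0]).replace('_', arr[0].slice(0, -2)).replace(/\$/g, 'es');
  return arr;
});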
if (typeof module !== "undefined" && module.exports) module.exports = main;
return main;
})();
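For the generator side, here is a minimal sketch of how the compressed tokens could be produced from plain singular/plural pairs. `compress` is a hypothetical helper, not something from the thread; `expand` only mirrors the three replace calls used in zip.map. It assumes words never contain '=', '_' or '$'.

```javascript
// Mirrors the decoding step of the zip.map call.
function expand(singular, token) {
  return token
    .replace('=', singular)
    .replace('_', singular.slice(0, -2))
    .replace(/\$/g, 'es');
}

// Hypothetical encoder: '=' when the plural starts with the whole singular,
// '_' when it starts with the singular minus its last two characters,
// otherwise the plural is kept as is.
function compress(singular, plural) {
  var stem = singular.slice(0, -2);
  var token;
  if (plural.indexOf(singular) === 0) {
    token = '=' + plural.slice(singular.length);
  } else if (stem.length > 0 && plural.indexOf(stem) === 0) {
    token = '_' + plural.slice(stem.length);
  } else {
    token = plural;
  }
  // Shorten every literal 'es' in the remainder to '$'.
  return token.split('es').join('$');
}

var pairs = [
  ['child', 'children'],
  ['leaf', 'leaves'],
  ['goose', 'geese'],
  ['deer', 'deer']
];

var tokens = pairs.map(function (pair) {
  return compress(pair[0], pair[1]);
});
console.log(tokens); // [ '=ren', '_av$', 'ge$e', '=' ]

// Round trip: expanding a compressed token gives back the plural.
var roundTripOk = pairs.every(function (pair) {
  return expand(pair[0], compress(pair[0], pair[1])) === pair[1];
});
console.log(roundTripOk); // true
```

The encoder does not always pick the same token as the hand-written table (for 'i' it keeps 'we' instead of '_we'), but every token it emits expands back to the original plural.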
oh, this idea for pulling these out into a file is good. They can go in the lexicon. Good one!
cool. Please note that the $ replacer in the zip function must be escaped in regexes, as in the zip.map function above. It might be better to use another small special character. Note to self: think before you code ;)
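To make the pitfall concrete, a small sketch: unescaped, `$` in a regex is the end-of-input anchor, so the marker is never matched. `escapeForRegExp` is a hypothetical helper name; the character class it uses is the commonly cited one for escaping regex metacharacters.

```javascript
var token = 'cactus$';

// Unescaped: /$/ matches the empty string at the end of the input,
// so 'es' is appended and the literal '$' stays in place.
var wrong = token.replace(/$/, 'es');
console.log(wrong); // 'cactus$es'

// Escaped: /\$/ matches the literal marker character.
var right = token.replace(/\$/g, 'es');
console.log(right); // 'cactuses'

// Hypothetical helper for building a RegExp from an arbitrary marker.
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var viaHelper = token.replace(new RegExp(escapeForRegExp('$'), 'g'), 'es');
console.log(viaHelper); // 'cactuses'
```

A marker with no special meaning in regexes, such as '~', would avoid the escaping entirely.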
Hm - the last commit does not work properly because in pluralize_rules we have rules for both singular to plural AND plural to plural, while in singularize_rules it is only plural to singular(???)
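A toy sketch of what symmetric rule sets could look like, so that both functions leave an already-converted word alone. The rule tables and the `applyRules` helper are assumptions for illustration only, not the library's actual pluralize_rules or singularize_rules.

```javascript
// First matching rule wins. Toy rules only.
var pluralizeRules = [
  [/(us|ss)$/, '$1es'],           // singular to plural: 'bus' -> 'buses'
  [/(s|x|z|ch|sh)es$/, '$1es'],   // plural to plural: 'boxes' stays 'boxes'
  [/([^s])s$/, '$1s'],            // plural to plural: 'towns' stays 'towns'
  [/(x|z|ch|sh)$/, '$1es'],       // singular to plural: 'box' -> 'boxes'
  [/$/, 's']                      // singular to plural: 'town' -> 'towns'
];

var singularizeRules = [
  [/(us|ss)$/, '$1'],             // singular to singular: 'bus' stays 'bus'
  [/(s|x|z|ch|sh)es$/, '$1'],     // plural to singular: 'boxes' -> 'box'
  [/([^s])s$/, '$1']              // plural to singular: 'towns' -> 'town'
  // no match: the word is returned unchanged (singular to singular)
];

function applyRules(rules, word) {
  for (var i = 0; i < rules.length; i++) {
    if (rules[i][0].test(word)) {
      return word.replace(rules[i][0], rules[i][1]);
    }
  }
  return word;
}

function pluralize(word) { return applyRules(pluralizeRules, word); }
function singularize(word) { return applyRules(singularizeRules, word); }

console.log(pluralize('town'), pluralize('towns'));     // towns towns
console.log(singularize('towns'), singularize('town')); // town town
console.log(singularize('bus'), singularize('buses'));  // bus bus
```

Without the singular-to-singular rule at the top of `singularizeRules`, 'bus' would fall through to the generic "strip s" rule and come out as 'bu', which is the kind of breakage described above.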
In general I am working on a factory method called "dictionary" based on the "words" and "rules" and this can be autotranslated by our database to several languages covering the ngram and metrics etc.