words / dale-chall-formula

Formula to find the grade level according to the (revised) Dale–Chall Readability Formula (1995)
https://wooorm.com/readability/
MIT License

Missing dictionary #1

Closed by localjo 8 years ago

localjo commented 8 years ago

It appears that in order to use this module, I need to come up with my own count of "difficult words", but I have no idea how to do that. Is there a particular dictionary I should be using? Does a program exist to compare words against that dictionary? Or is that something that needs to be built in order to use this formula?

wooorm commented 8 years ago

Yup, and the dictionary is listed in the readme as well :)

See dale-chall for a list of words which count as “familiar”.

Does that help?

Oh, and I suggest using parse-english for properly counting words / sentences!
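For readers following along, here is a minimal, self-contained sketch of what "counting difficult words" could look like in practice. It is not the module's own code. The `familiarWords` set is a tiny hypothetical stand-in for the real dale-chall list, the function names are invented for illustration, and the regex-based tokenizing is deliberately naive (a proper parser such as parse-english would do this better). The constants are those of the revised Dale–Chall formula.

```javascript
// Hypothetical stand-in for the real list of familiar words.
// In a real project this would come from the dale-chall word list.
const familiarWords = new Set([
  'the', 'cat', 'sat', 'on', 'a', 'mat', 'it', 'was', 'happy'
])

// Naive word splitting: lowercase runs of letters and apostrophes.
function getWords(text) {
  return text.toLowerCase().match(/[a-z']+/g) || []
}

// Naive sentence counting: split on terminal punctuation, drop empty parts.
function countSentences(text) {
  return text.split(/[.!?]+/).filter((part) => part.trim().length > 0).length
}

// A word counts as "difficult" when it is not in the familiar list.
function countDifficultWords(words) {
  return words.filter((word) => !familiarWords.has(word)).length
}

// Revised Dale–Chall score:
//   0.1579 * (percentage of difficult words) + 0.0496 * (words per sentence),
//   plus 3.6365 when more than 5% of the words are difficult.
function daleChallScore(text) {
  const words = getWords(text)
  const sentences = countSentences(text)
  if (words.length === 0 || sentences === 0) return NaN

  const percentDifficult = (countDifficultWords(words) / words.length) * 100
  let score = 0.1579 * percentDifficult + 0.0496 * (words.length / sentences)
  if (percentDifficult > 5) score += 3.6365
  return score
}

console.log(daleChallScore('The cat sat on a mat. It was happy.'))
console.log(daleChallScore('The cat sat on a trampoline.'))
```

With the real word list in place of the stand-in, the three counts (words, sentences, difficult words) are the inputs this formula needs.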

localjo commented 8 years ago

Sweet! Thanks for the tips Titus! Sorry I missed that link to the dictionary in the readme. Thanks so much for all your work on these projects!

wooorm commented 8 years ago

Oh no problem at all, and thank you! Hope you make something cool with ’em :)

localjo commented 8 years ago

Planning on it. 😎 By the way, what's the difference between retext and alex?

wooorm commented 8 years ago

Great! alex is an example implementation of retext with two plugins (retext-equality and retext-profanities). It also uses remark if you pass in markdown, so it can ignore the syntax. Does that sound logical?

localjo commented 8 years ago

Yep! To use retext with plugins of my choice, it sounds like I should start by requiring retext in my project. Thanks!

wooorm commented 8 years ago

Indeed. Have you also seen retext-readability? It combines this formula with some others to get reliable age levels, which might be useful.

localjo commented 8 years ago

Yeah, retext-readability is definitely the core of what I'm looking for at the moment. What's unified? Should I be using that directly? Or retext? It seems like alex uses unified directly.

localjo commented 8 years ago

Sorry if this long thread of questions is in the wrong place. 🙈

wooorm commented 8 years ago

So retext, remark, and rehype are implementations of unified, but more importantly they’re ecosystems: you don’t need the actual packages with those names to use the “umbrellas” themselves. A cool thing is that they can be combined, so you can transform from markdown to HTML, and meanwhile extract just the text content and pass that to retext plug-ins.

As alex doesn’t do anything with compilation, it uses only the retext-english parser and the remark-parse parser. That way, the project does not include stuff it doesn’t need.

If you’re not compiling, or if you want to use multiple systems together, I suggest using unified directly, but it’s a bit more advanced.

I think the best place to talk about this is the Gitter channel?!