STAT325-S24 / HistoryAmherstCollege

Text and analysis related to William S. Tyler's "History of Amherst College" (1873)
MIT License

hyphenated words #21

Open arogers24 opened 5 months ago

arogers24 commented 5 months ago

How should we check for the edge case where words are supposed to be hyphenated?

We discussed having a library of hyphenated words. We would check whether the joined word is in the library and keep the hyphen if it is.
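A minimal sketch of the library idea, in Python for illustration only. The set `KEEP_HYPHEN` and the function `join_line_break` are hypothetical names, and the three entries are placeholders rather than words taken from the text; a real library would need to be built from a proper list of hyphenated words.

```python
# Hypothetical library of words that should keep their hyphen.
# These entries are placeholders, not drawn from the source text.
KEEP_HYPHEN = {"well-known", "half-century", "self-government"}


def join_line_break(first_part: str, second_part: str) -> str:
    """Join a word that was split as 'first_part-' / 'second_part' across lines.

    The hyphen is kept only when the hyphenated form is in the library;
    otherwise the two parts are joined without it.
    """
    hyphenated = f"{first_part}-{second_part}"
    if hyphenated.lower() in KEEP_HYPHEN:
        return hyphenated
    return first_part + second_part
```

A lookup like this only helps for words already in the library, so anything missing from it would still be joined without the hyphen.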

nicholasjhorton commented 5 months ago

This is a great question (related to #2).

See https://helpingwithwriting.com/Lists/Hyphenated-Words.htm and the attached screenshot (Screenshot 2024-03-26 at 3 10 56 PM).

Let's defer for now.

nicholasjhorton commented 5 months ago

I thought that this was a great observation.

It should be easy to pull out the beginning parts of the words that are hyphenated at a line break and tabulate them, to see which ones are likely to be genuinely hyphenated words.
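One way this summary might look, sketched in Python under the assumption that the text is available as a list of lines. `LINE_END_HYPHEN` and `count_line_end_prefixes` are hypothetical names, not functions from the repository.

```python
import re
from collections import Counter

# Matches a run of letters followed by a hyphen at the very end of a line.
LINE_END_HYPHEN = re.compile(r"([A-Za-z]+)-$")


def count_line_end_prefixes(lines):
    """Count the beginning parts of words that are hyphenated at a line end."""
    counts = Counter()
    for line in lines:
        match = LINE_END_HYPHEN.search(line.rstrip())
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

Calling `count_line_end_prefixes(lines).most_common(20)` would list the most frequent beginnings, which may be a reasonable place to start looking for prefixes such as "well" or "self".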

Let's discuss this at our standup on Thursday.

Casey308 commented 5 months ago

I think one solution could be to keep a table of the words that got combined by the function and then check it by hand. Would this be too hard to implement?
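A rough sketch of how such a table could be collected, in Python for illustration. `dehyphenate_with_log` is a hypothetical stand-in, not the project's actual dehyphenate function, and it assumes the text is a list of lines with a split word ending one line and starting the next.

```python
import re

# A run of letters plus a hyphen at the end of a line, and the
# run of letters that opens the following line.
SPLIT_WORD = re.compile(r"([A-Za-z]+)-$")
LEADING_WORD = re.compile(r"^([A-Za-z]+)")


def dehyphenate_with_log(lines):
    """Join words split across line breaks and record every join made.

    Returns the rewritten lines plus a list of records (one per join)
    that can be reviewed by hand afterwards.
    """
    out = [line.rstrip() for line in lines]
    log = []
    for i in range(len(out) - 1):
        head = SPLIT_WORD.search(out[i])
        tail = LEADING_WORD.match(out[i + 1])
        if head and tail:
            joined = head.group(1) + tail.group(1)
            # Replace 'first-' with the joined word on the current line
            # and drop the second part from the start of the next line.
            out[i] = out[i][:head.start()] + joined
            out[i + 1] = out[i + 1][tail.end():].lstrip()
            log.append({
                "line": i + 1,
                "first": head.group(1),
                "second": tail.group(1),
                "joined": joined,
            })
    return out, log
```

The log is a plain list of dictionaries, so it could be written to a CSV file or loaded into a data frame for the hand check.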

FranciscoJFM02 commented 5 months ago

A solution to this could be to find the cases where the hyphen is genuine and move the second part of the word up to join its first part on the line above, so that the line no longer ends in a hyphen and does not trigger the dehyphenate function. (This can be done later down the road.)

nicholasjhorton commented 5 months ago

I'm reopening this because it's not done (just deferred until later).

nicholasjhorton commented 5 months ago

My thought is that one could search for WORD- and see if WORD is a word (perhaps by a visual scan).
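A small sketch of that search, in Python for illustration. The `vocabulary` argument is an assumption: any set of known lowercase words would do, and `flag_candidates` is a hypothetical name.

```python
import re

# WORD- at the end of a line: a run of letters followed by a hyphen.
WORD_HYPHEN = re.compile(r"\b([A-Za-z]+)-$")


def flag_candidates(lines, vocabulary):
    """Return (line number, WORD) pairs where WORD- ends a line and WORD is a word.

    `vocabulary` is assumed to be a set of lowercase words.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        match = WORD_HYPHEN.search(line.rstrip())
        if match and match.group(1).lower() in vocabulary:
            flagged.append((number, match.group(1)))
    return flagged
```

This will also flag ordinary line-break splits whose first part happens to be a word on its own, so the flagged list is meant as input to the visual scan, not a replacement for it.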

I'm putting this back into the backlog for now...