NewcastleRSE / nhs_readability_react

ReactJS version of the NHS Document Readability project based on MUI-rte

PRISM word swaps - statements / alternative verb forms / plurals #23

Open nicoleahmed opened 2 years ago

nicoleahmed commented 2 years ago

Single PRISM words are highlighted fine. Statements / multiple-word swaps are currently not highlighted, e.g. atopic dermatitis, myocardial infarction.

Alternative forms of words aren't captured (verb forms, or verbs to adjectives), e.g. anticipate is matched but not anticipatory.

Page 41 of this document has the suggested swaps: https://www.nhlbi.nih.gov/files/docs/ghchs_readability_toolkit.pdf

Separately we are talking about:

- what to call this analysis option: "complex words with suggested swaps"
- adding another analysis option: "long words"
- how to capture things that should be added here, during user testing surveys and amongst ourselves

rosselton commented 1 year ago

It might not be possible in the timeframe to write a function that handles multiple words. However, most phrases like 'congenital anomaly' could be split into single words with simpler options. Where the words are definitely connected, e.g. 'clinical trial', there could be an exceptions list, which could be kept separately.

nicoleahmed commented 1 year ago

compound phrases could be highlighted as separate words as suggested by Becky

But this would impact on some of the word swaps

So for example

myocardial = heart muscle

infarction = loss of blood circulation to an area

but together: myocardial infarction = heart attack

so some of these compound phrases will need to be added to an exception list
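A rough sketch of how an exception list could sit alongside the single-word swaps: build one pattern with the longest terms first, so a phrase on the exception list wins over the words inside it. The `Swap` and `Match` shapes, the `findSwaps` name and the two small lists are hypothetical, made up from the examples in this thread; the tool's real data format may differ.

```typescript
// Hypothetical shapes; the project's real PRISM list format may differ.
interface Swap {
  term: string;       // word or phrase to look for
  suggestion: string; // plain-language alternative
}

interface Match {
  start: number; // index into the original text
  end: number;   // exclusive
  term: string;
  suggestion: string;
}

// Illustrative data taken from the examples in this thread.
const singleWords: Swap[] = [
  { term: "myocardial", suggestion: "heart muscle" },
  { term: "infarction", suggestion: "loss of blood circulation to an area" },
];
const phraseExceptions: Swap[] = [
  { term: "myocardial infarction", suggestion: "heart attack" },
];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function findSwaps(text: string, words: Swap[], phrases: Swap[]): Match[] {
  // Longest terms first: alternation tries options in order, so the
  // phrase is tried before the single words it contains.
  const all = phrases
    .concat(words)
    .filter((s) => s.term.trim() !== "")
    .sort((a, b) => b.term.length - a.term.length);
  if (all.length === 0) return [];

  const lookup: Record<string, string> = {};
  for (const s of all) {
    lookup[s.term.toLowerCase().replace(/\s+/g, " ")] = s.suggestion;
  }

  // Allow any run of whitespace between the words of a phrase.
  const pattern = all
    .map((s) => escapeRegExp(s.term.trim()).replace(/\s+/g, "\\s+"))
    .join("|");
  const regex = new RegExp("\\b(?:" + pattern + ")\\b", "gi");

  const matches: Match[] = [];
  let m: RegExpExecArray | null;
  while ((m = regex.exec(text)) !== null) {
    const key = m[0].toLowerCase().replace(/\s+/g, " ");
    matches.push({
      start: m.index,
      end: m.index + m[0].length,
      term: key,
      suggestion: lookup[key] || "",
    });
  }
  return matches;
}
```

With these lists, `findSwaps("A myocardial infarction damages myocardial tissue.", singleWords, phraseExceptions)` gives two matches: the phrase with "heart attack", and the later standalone "myocardial" with "heart muscle". Each position in the text is only matched once, so the phrase is never highlighted a second time as two separate words.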

nicoleahmed commented 1 year ago

So at the moment neither "myocardial" nor "infarction" nor "myocardial infarction" is highlighted by the PRISM function.

nicoleahmed commented 1 year ago

The SHELL editor isn't using PRISM swaps explicitly, and it also doesn't do compound phrases.

e.g. "contrast medium" should go to "dye", but only "medium" is highlighted, as a multisyllabic word

nicoleahmed commented 1 year ago

Possible solutions:

- compound phrases which must be together -> exception list
- word forms -> wildcards for regular expressions, e.g. attain* -> attains, attained, attainment
- word forms -> lookup to dictionary for word forms, e.g. goose isn't gooses, it's geese
- word forms -> also need English language alternatives, e.g. fetal in USA but foetal in UK
- word forms -> need to separate out the word forms in PRISM into separate listings in our tool, e.g. currently we have "expedite, expeditious" as the search term, which no one will ever search -> should be expedite, then expedited, then expedites, then expeditious
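For the wildcard idea, a minimal sketch of turning an entry like `attain*` into a regular expression, with a small table for UK/US spellings. The names `wildcardToRegExp`, `spellingVariants` and `patternsFor` are hypothetical and the variants table only holds the one example from this thread. Irregular forms such as goose/geese can't be covered by a wildcard and would still need the dictionary lookup.

```typescript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turn a wildcard entry such as "attain*" into a whole-word,
// case-insensitive pattern. "*" stands for any run of word characters.
function wildcardToRegExp(pattern: string): RegExp {
  const body = pattern
    .trim()
    .split("*")
    .map((piece) => escapeRegExp(piece))
    .join("\\w*");
  return new RegExp("\\b" + body + "\\b", "i");
}

// Hypothetical table of spelling alternatives, keyed by the listed term.
const spellingVariants: Record<string, string[]> = {
  fetal: ["foetal"],
};

// All patterns to try for one listed term: the term itself plus any
// known spelling alternatives.
function patternsFor(term: string): RegExp[] {
  const variants = spellingVariants[term.toLowerCase()] || [];
  return [term].concat(variants).map((t) => wildcardToRegExp(t));
}

// True when the word matches the term or one of its alternatives.
function matchesTerm(word: string, term: string): boolean {
  return patternsFor(term).some((re) => re.test(word));
}
```

Here `matchesTerm("attainment", "attain*")` and `matchesTerm("Foetal", "fetal")` are both true, while `matchesTerm("unattainable", "attain*")` is false because the pattern has to start at a word boundary. A wildcard can also match more than intended (e.g. `attain*` also matches "attainable"), which may be a reason to prefer explicit separate listings for some entries.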

Solution for first general testing release: keep function as it is, but rename from PRISM to complex words (because we aren't currently using the complete list but also because people may not know what PRISM means)

Make easy changes now - e.g. splitting things in brackets and slashes to separate terms
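The splitting step could look something like this. It is only a sketch: the exact punctuation used in our copy of the PRISM list isn't shown in this thread, so the entry formats below (commas, slashes, round brackets) and the `splitEntry` name are assumptions.

```typescript
// Split one raw list entry into separate, lower-cased search terms.
// Assumes alternatives are separated by commas or slashes, or given
// in round brackets after the main term.
function splitEntry(raw: string): string[] {
  const chunks: string[] = [];

  // Pull bracketed text out as its own chunk and leave a space behind.
  const outside = raw.replace(
    /\(([^)]*)\)/g,
    (_match: string, inner: string) => {
      chunks.push(inner);
      return " ";
    }
  );
  chunks.unshift(outside);

  const result: string[] = [];
  for (const chunk of chunks) {
    for (const part of chunk.split(/[,\/]/)) {
      const term = part.trim().replace(/\s+/g, " ").toLowerCase();
      // Skip empty pieces and duplicates, keep first-seen order.
      if (term !== "" && result.indexOf(term) === -1) {
        result.push(term);
      }
    }
  }
  return result;
}
```

For example, `splitEntry("expedite, expeditious")` returns `["expedite", "expeditious"]` and `splitEntry("anticipate (anticipatory)")` returns `["anticipate", "anticipatory"]`. Any entry where a slash is part of the term itself would be split wrongly, so those would need checking by hand or adding to the exception list.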

Testing possible solutions after first round of user feedback -> with review of processing speed of tool