aphp / edsnlp

Modular, fast NLP framework, compatible with PyTorch and spaCy, offering tailored support for French clinical notes.
https://aphp.github.io/edsnlp/
BSD 3-Clause "New" or "Revised" License

Feature request: Pollution #138

Closed: aricohen93 closed this issue 1 year ago

aricohen93 commented 1 year ago

Feature type

Add new patterns

Description

Add a pollution pattern that matches text like:

2/2Pat : <NOM> <Prenom> le <date> IPP <ipp> Intitulé RCP : Urologie HMN le <date>

1/2Pat : <NOM> <Prenom> le <date> IPP <ipp> Intitulé RCP : Urologie HMN le <date>
percevalw commented 1 year ago

Hi @aricohen93, could you expand a bit more on why such a scheme would constitute pollution? It seems to me that there is a lot of useful information here!

aricohen93 commented 1 year ago

Could be, but either way it's a page footer, and it is inserted in the middle of the body text.

So we have the following structure:

[TEXT] [FOOTER] [Continuation of preceding text]
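
To make the request concrete, here is a minimal sketch of what matching and removing such a footer could look like with plain Python `re`. It is not the pattern that ended up in edsnlp: the names `DATE`, `FOOTER` and `strip_footer` are hypothetical, and the sketch assumes dates written as `dd/mm/yyyy`, an IPP with no spaces in it, and a footer that stays on a single line.

```python
import re

# Assumption: dates look like dd/mm/yyyy (the issue only shows "<date>").
DATE = r"\d{2}/\d{2}/\d{4}"

# Hypothetical footer pattern, e.g.
# "1/2Pat : NOM Prenom le <date> IPP <ipp> Intitulé RCP : Urologie HMN le <date>"
FOOTER = re.compile(
    rf"\d+/\d+\s*Pat\s*:\s*.+?\s+le\s+{DATE}"      # page counter, patient name, first date
    rf"\s+IPP\s+\S+"                               # patient identifier (assumed to have no spaces)
    rf"\s+Intitulé RCP\s*:\s*.+?\s+le\s+{DATE}"    # meeting title, second date
)


def strip_footer(text: str) -> str:
    """Remove footer occurrences so the surrounding text is contiguous again."""
    without_footer = FOOTER.sub(" ", text)
    # Collapse the runs of spaces left behind by the removal.
    return re.sub(r" {2,}", " ", without_footer).strip()


sample = (
    "Le patient est suivi pour "
    "1/2Pat : DUPONT Jean le 01/02/2020 IPP 12345678 "
    "Intitulé RCP : Urologie HMN le 03/02/2020 "
    "un adénocarcinome prostatique."
)
print(strip_footer(sample))
```

With this, `[TEXT] [FOOTER] [Continuation of preceding text]` becomes `[TEXT] [Continuation of preceding text]`. A real rule would need to be checked against actual documents, since the name and title fields are matched lazily and could contain unexpected content.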

percevalw commented 1 year ago

Completed by https://github.com/aphp/edsnlp/pull/139 and https://github.com/aphp/edsnlp/pull/150