asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/
Apache License 2.0

Create an example for building bio NER pipeline #63

Open Leolty opened 2 years ago

Leolty commented 2 years ago

Describe the solution you'd like

ForteHealth incorporates scispaCy for bio NER annotation, so I think the very first example can simply be a pipeline for bio NER annotation. The demo from scispaCy is here

In scispaCy, the model en_ner_bc5cdr_md annotates Disease and Chemical entities, while the model en_ner_bionlp13cg_md annotates Cancer, Organ, and other entity types. We can also show this difference by building the pipeline with different configurations.
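As a rough illustration of the configuration idea, the sketch below keeps one config entry per scispaCy model and lists only the entity types named in this issue, so the label sets are partial. `NerConfig`, `CONFIGS`, and `describe` are hypothetical names chosen for this example; they are not ForteHealth or scispaCy API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NerConfig:
    """Hypothetical config for a bio NER tagger (not a ForteHealth class)."""

    model_name: str
    entity_types: tuple  # partial: only the types mentioned in this issue


# One entry per scispaCy model discussed above.
CONFIGS = {
    "bc5cdr": NerConfig("en_ner_bc5cdr_md", ("DISEASE", "CHEMICAL")),
    "bionlp13cg": NerConfig("en_ner_bionlp13cg_md", ("CANCER", "ORGAN")),
}


def describe(key: str) -> str:
    """Return a one-line summary of the selected configuration."""
    cfg = CONFIGS[key]
    return f"{cfg.model_name}: {', '.join(cfg.entity_types)}"


if __name__ == "__main__":
    for key in CONFIGS:
        print(describe(key))
```

Switching the example between the two models would then only mean passing a different key, which is the behaviour the demo is meant to show.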

Possible components to include:

  1. Sentence Segmenter
  2. Tokenizer
  3. Bio NER Tagger
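To make the order of the three components concrete, here is a minimal standard-library sketch of how they could chain together. It is a stand-in and not the Forte or ForteHealth API: `Doc`, `segment_sentences`, `tokenize`, `tag_entities`, and `run_pipeline` are hypothetical names, and the tiny `LEXICONS` lookup only imitates what a scispaCy model would do. In the real example the tagger step would presumably be the scispaCy-backed processor.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical container for text plus character-offset annotations."""

    text: str
    sentences: list = field(default_factory=list)  # (start, end)
    tokens: list = field(default_factory=list)     # (start, end)
    entities: list = field(default_factory=list)   # (start, end, label)


def segment_sentences(doc: Doc) -> Doc:
    # Naive rule: a sentence runs up to the next '.', '!' or '?'.
    for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]?", doc.text):
        doc.sentences.append((m.start(), m.end()))
    return doc


def tokenize(doc: Doc) -> Doc:
    # Split each sentence into word and punctuation tokens.
    for s_start, s_end in doc.sentences:
        for m in re.finditer(r"\w+|[^\w\s]", doc.text[s_start:s_end]):
            doc.tokens.append((s_start + m.start(), s_start + m.end()))
    return doc


# Toy lexicons standing in for the two scispaCy models (assumption).
LEXICONS = {
    "en_ner_bc5cdr_md": {"aspirin": "CHEMICAL", "ulcer": "DISEASE"},
    "en_ner_bionlp13cg_md": {"liver": "ORGAN", "carcinoma": "CANCER"},
}


def tag_entities(doc: Doc, model_name: str) -> Doc:
    # Label every token found in the lexicon selected by model_name.
    lexicon = LEXICONS[model_name]
    for start, end in doc.tokens:
        label = lexicon.get(doc.text[start:end].lower())
        if label:
            doc.entities.append((start, end, label))
    return doc


def run_pipeline(text: str, model_name: str) -> Doc:
    # Order matters: sentences first, then tokens, then entities.
    doc = tokenize(segment_sentences(Doc(text)))
    return tag_entities(doc, model_name)


if __name__ == "__main__":
    result = run_pipeline(
        "Aspirin can cause ulcer. The liver clears it.", "en_ner_bc5cdr_md"
    )
    print([(result.text[s:e], label) for s, e, label in result.entities])
```

Running the same text with the other model name changes only which tokens get labelled, which is the contrast the example is supposed to demonstrate.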