allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.72k stars 229 forks

Models are not being able to identify the labels. #527

Closed prantasaha107 closed 1 month ago

prantasaha107 commented 1 month ago

Here is a brief description of what I did:

    import spacy

    text2 = "Albumin, a major protein in blood plasma, plays a vital role in maintaining oncotic pressure. Immunoglobulin G (IgG), an antibody, is involved in the immune response. Actin, a globular protein, forms the cytoskeleton of cells. Myosin, another motor protein, interacts with actin to generate muscle contraction. Hemoglobin, a tetrameric protein, transports oxygen in red blood cells."

    nlp_en_ner_jnlpa_md = spacy.load("en_ner_jnlpba_md")
    document1 = nlp_en_ner_jnlpa_md(text2)

    # Finding proteins - 1st iteration.
    proteins = []

    # Iterate over entities in the whole document
    for ent in document1.ents:
        if ent.label_ in ["PROTEIN", "RNA"]:
            proteins.append(ent.text)
    print(proteins)

proteins is an empty list.
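
One way to narrow this down might be to look at every label the pipeline actually emits, instead of filtering for two of them up front. The following is only a minimal sketch: it assumes nothing beyond each entity exposing `.text` and `.label_`, as spaCy entity spans do. The helper name `group_entities_by_label` and the stand-in entities are hypothetical, made up so the snippet runs without the model installed; they are not real model output.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_entities_by_label(ents):
    """Map each entity label to the list of entity texts that carry it."""
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Hypothetical stand-in entities (NOT real model output), used only so the
# sketch is runnable on its own. With spaCy loaded, pass the real entities
# instead, e.g. group_entities_by_label(document1.ents).
stand_in_ents = [
    SimpleNamespace(text="Albumin", label_="PROTEIN"),
    SimpleNamespace(text="red blood cells", label_="CELL_TYPE"),
    SimpleNamespace(text="Hemoglobin", label_="PROTEIN"),
]

grouped = group_entities_by_label(stand_in_ents)
print(grouped)
```

If the real `document1.ents` gives an empty dictionary here, the document itself has no entities, which would point at the text or the variable that was processed and not at the label filter.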