AstraZeneca / KAZU

Fast, world class biomedical NER
https://AstraZeneca.github.io/KAZU/
Apache License 2.0

Explain running Kazu in a Notebook in Quickstart #19

Closed. raylite closed this 6 months ago

raylite commented 7 months ago

I am trying to run the Kazu example from the documentation in a Notebook, but it raises an error.

I got this error: ipykernel_launcher.py: error: unrecognized arguments: -f

I'm not sure where the argument is being passed or what to do about it. See the attached images for the error dump.

[Screenshot 2024-02-01 192156] [Screenshot 2024-02-01 192254]

EFord36 commented 7 months ago

Hi,

This appears to be the Jupyter Notebook interacting poorly with Hydra's behaviour. I assume you're using the code in the Quickstart docs?

If so, this is the appropriate code for a notebook. I've just tested it myself and got it working:

First cell:

import os
from pathlib import Path

from hydra import compose, initialize_config_dir
from hydra.utils import instantiate

from kazu.data.data import Document
from kazu.pipeline import Pipeline
from kazu.utils.constants import HYDRA_VERSION_BASE

# the hydra config is kept in the model pack
cdir = Path(os.environ["KAZU_MODEL_PACK"]).joinpath("conf")

def kazu_test():
    # compose the config directly via Hydra's compose API
    with initialize_config_dir(version_base=HYDRA_VERSION_BASE, config_dir=str(cdir)):
        cfg = compose(config_name="config")
    # build the pipeline from the composed config
    pipeline: Pipeline = instantiate(cfg.Pipeline)
    text = "EGFR mutations are often implicated in lung cancer"
    doc = Document.create_simple_document(text)
    # run the pipeline on the document
    pipeline([doc])
    # print the text of the first section
    print(f"{doc.sections[0].text}")

Second cell:

kazu_test()

Let me know if that works for you or if you have further issues.
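One thing that can still go wrong with the first cell is that `KAZU_MODEL_PACK` is not set in the notebook kernel's environment, in which case `os.environ["KAZU_MODEL_PACK"]` raises a bare `KeyError`. Below is a minimal sketch of a check you could run first. It uses only the standard library; the helper name `resolve_conf_dir` is hypothetical and not part of Kazu.

```python
import os
from pathlib import Path


def resolve_conf_dir(env_var: str = "KAZU_MODEL_PACK") -> Path:
    """Return <model pack>/conf, or raise a readable error if it cannot be found.

    Hypothetical helper for illustration; not part of the Kazu API.
    """
    model_pack = os.environ.get(env_var)
    if not model_pack:
        raise RuntimeError(
            f"{env_var} is not set in this kernel's environment; "
            "set it to the model pack directory before running the first cell."
        )
    conf_dir = Path(model_pack) / "conf"
    if not conf_dir.is_dir():
        raise RuntimeError(f"Expected a Hydra config directory at {conf_dir}")
    return conf_dir
```

If this returns without raising, the result can be used in place of `cdir` in the first cell.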

I'll think about how best we can add to the quickstart documentation so that notebook users are better supported. Thanks for opening the issue so we had the opportunity to improve this! And thanks for giving kazu a try 😃
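For anyone curious about the original error, here is a rough sketch of the likely mechanism. It assumes the failing code parses the process's command-line arguments with an argparse-style parser. Jupyter starts the kernel process with a `-f <connection file>` argument, and a parser that does not define `-f` rejects it. The parser below is a stand-in for illustration, not Hydra's actual parser, and the connection file path is made up.

```python
import argparse


def accepts(argv: list[str]) -> bool:
    """Return True if a stand-in CLI parser accepts argv, False if it rejects it."""
    # Stand-in parser: it only knows positional "overrides", not a -f option.
    parser = argparse.ArgumentParser(prog="ipykernel_launcher.py")
    parser.add_argument("overrides", nargs="*")
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        # argparse reports the problem on stderr and then exits
        return False


# Arguments similar to what a Jupyter kernel is launched with (hypothetical path).
kernel_argv = ["-f", "/tmp/kernel-1234.json"]
print(accepts(kernel_argv))
print(accepts(["some.key=value"]))
```

Composing the config with `compose`, as in the cells above, sidesteps this because the kernel's command-line arguments are never parsed.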

raylite commented 7 months ago

Yes, thank you for your response. Later yesterday I tried the example using Hydra's compose (the same approach you provided), and I can confirm it worked.

EFord36 commented 7 months ago

Great! In that case, let's leave the ticket open until I've added documentation for others. I'll rename the ticket to reflect that, if that's OK?

EFord36 commented 7 months ago

Just to keep you updated, I've got a PR on our 'internal' version of the repo that resolves this. It should be made public in the next release.