Open mylibrar opened 2 years ago
This is not specifically with regards to remote processing of stave, but general working of stave with our pipeline outputs. I was just testing out stave with some json annotated output. I uploaded the base annotations first (which isn't base_ontology.json but a customized ontology json, where in I just had DrugLookupWindowAnnotation
, NegationContext
, UMLSConceptLink
, MedicalEntityMention
and MedicalArticle
annotations. However, when I upload the data pack json output in the next page, I see only DrugLookupWindowAnnotation
and NegationContext
in the left pane/annotation legends (output does have MedicalEntityMention
annotations). I am not sure why that's happening.
There are two places that could affect this, we know that this is not very smooth right now:
base_ontology
to populate an internal ontology if not provided. So most likely it didn't take all the new entry types you defined.As for 1, right now we are working on some new mechanism to populate the ontology, in the future, we should have all "used" entry types, but not just base_ontology
. But this is not yet implemented.
As for 2, @mylibrar may know some details. I forget how we define the default configs.
Quick fix for 1 is to feed an ontology file to Pipeline at the begining. Note that it need to be a "full" ontology file, this can be obtained by using the "merged" option here: https://asyml-forte.readthedocs.io/en/latest/toc/ontology_generation.html#ontology-generation-steps
Thanks @hunterhector. To clarify, I am currently just using an ontology json file and the output json file to work with a new Stave project. When I check the output json file, I do see new entry type annotations being detected which means pipeline did work with expected ontologies; however, when I upload the output json file to stave project, I don't see the new types in the left pane. Also, since we aren't using StaveProcessor in the pipeline directly for this example, but using json files to Stave, I feel 2nd point can't be the culprit either? Not sure if I am understanding this right
Thanks for the feedback. Could you provide the ontology.json and datapack.json that you used for testing? I'll try to replicate this issue on my side.
I do see new entry type annotations being detected which means the pipeline did work with expected ontologies;
So the output json will definitely work that the pipeline works fine, but that doesn't mean the pipeline pass all the types to stave, you need to check this variable in particular (located in pipeline resources): https://github.com/asyml/forte/blob/master/forte/processors/stave/stave_processor.py#L152
Actually, I am pretty sure that if you didn't pass your ontology through Pipeline
initialization, Stave won't be able to pick it up.
Thanks for the feedback. Could you provide the ontology.json and datapack.json that you used for testing? I'll try to replicate this issue on my side.
Onto_temp is the ontology json file, the other file is the output datapack onto_temp.json.zip 211230762558809616268911394647227693964.json.zip
I think the reason why you can only see DrugLookupWindowAnnotation
and NegationContext
in the left pane is that stave
is skipping all the Generics
entries when generating the legends. Actually this is an expected behavior because it's hard for stave
to render the Generics
type. So if you want stave
to display all the ontology definitions, maybe you can reconsider the ontology designing and use more Annotation
s and Link
s.
@mylibrar On a completely different note, I was able to start stave server without any problem until today. I then reinstalled it from source and now it throws 500s whenever I try to access my localhost. I am sure its some small thing that I am missing but I tried some stuff out and wasnt able to resolve it. Please advise. Following is the stack trace for the issue:
(base) piyush@Piyushs-Air simple-backend % stave start -o -n 8880
For more details, check out log file at /Users/piyush/.stave/log.
Starting Stave server at http://localhost:8880.
Quit the server with CONTROL-C.
[ERROR] 2022-04-26 17:58:32,394 web Uncaught exception GET / (::1)
HTTPServerRequest(protocol='http', host='localhost:8880', method='GET', uri='/', version='HTTP/1.1', remote_ip='::1')
Traceback (most recent call last):
File "/Users/piyush/miniforge3/lib/python3.9/site-packages/tornado/web.py", line 1702, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "/Users/piyush/casl/stave/simple-backend/stave_backend/lib/stave_viewer.py", line 214, in get
self.render(self._build_path + "/index.html")
File "/Users/piyush/miniforge3/lib/python3.9/site-packages/tornado/web.py", line 863, in render
html = self.render_string(template_name, **kwargs)
File "/Users/piyush/miniforge3/lib/python3.9/site-packages/tornado/web.py", line 1009, in render_string
t = loader.load(template_name)
File "/Users/piyush/miniforge3/lib/python3.9/site-packages/tornado/template.py", line 446, in load
self.templates[name] = self._create_template(name)
File "/Users/piyush/miniforge3/lib/python3.9/site-packages/tornado/template.py", line 477, in _create_template
with open(path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/piyush/casl/stave/simple-backend/stave_backend/lib/build/index.html'
[ERROR] 2022-04-26 17:58:32,398 web 500 GET / (::1) 4.77ms
The recommended method to install stave
is pip install stave==0.0.2.dev1
. Right now installing stave
directly from source might not work because there's actually some extra steps to build the package:
cd
into stave/
yarn build
to generate a bundle folder called ./build
./build
folder into ./simple-backend/stave_backend/lib/
pip install -e ./
You can refer to the workflow here to see what steps are required for packaging.
@mylibrar Thanks for that.
Is it at all possible to run the current clinical_processing_pipeline.py
i.e. with the present code? I followed README and I landed up on the query_chatbot.json showing up in the annotation viewer instead of the dialogue like interface. Like so
@mylibrar Thanks for that. Is it at all possible to run the current
clinical_processing_pipeline.py
i.e. with the present code? I followed README and I landed up on the query_chatbot.json showing up in the annotation viewer instead of the dialogue like interface. Like so
Nevermind, resolved this
I was trying out the following design to get the skeleton working and just thought I would share it with you. Please advise or suggest if you have anything in mind. The code is in works,
The overall design looks good to me. One thing to note is that PL1 is already implemented by stave
. Any stave
project with its layout configured to DialogueBox
should be able to load this pipeline. So you only need to implement PL2 and make sure it's compatible with PL1. You can also use the Eliza
project in stave
to test your remote service (refer to this tutorial for details).
@mylibrar I realized that all string attributes in the annotations are truncated after 3 characters. Is that configurable in stave?
@mylibrar I am done with all the other changes, and was trying to refactor utterance_searcher.py with the latest APIs. However, I see that the API calls aren't going through from the utterance_searcher.py file. On the other hand, when I use the same APIs in the clinical_processing_pipeline.py file, they work just fine. When trying to call them from utterance_searcher.py, I can see in the browser, that the API call goes on indefinitely and the page just becomes unresponsive. This happens with any call that I try, be it login, create document, etc.
The control flow gets stuck at Line 90, printing the print statement above but not below the login API. I am clueless as to what might be happening here.
@mylibrar I am done with all the other changes, and was trying to refactor utterance_searcher.py with the latest APIs. However, I see that the API calls aren't going through from the utterance_searcher.py file. On the other hand, when I use the same APIs in the clinical_processing_pipeline.py file, they work just fine. When trying to call them from utterance_searcher.py, I can see in the browser, that the API call goes on indefinitely and the page just becomes unresponsive. This happens with any call that I try, be it login, create document, etc.
The control flow gets stuck at Line 90, printing the print statement above but not below the login API. I am clueless as to what might be happening here.
My guess is that a deadlock occurs when the remote service invoked by Stave
backend posts a request back to Stave
. Django
is a synchronous framework which means it can only take one request at a time. We can update the backend using some async mechanisms but it's nontrivial. Right now I guess a better solution is to directly interact with the sqlite database just as before using sqlite_insert and we can consider updating the implementation in future.
Is your feature request related to a problem? Please describe. The current clinical_pipeline example needs to be fixed according to the latest updates in
stave
of how to handle remote pipeline service.stave
will no longer assemble an NLP pipeline itself. Instead,stave
now uses a pipeline withRemoteProcessor
to call remote service.Describe the solution you'd like Some of the steps in README.md might not work anymore. Hence we need to improve the example to make it solid:
mimiciii/1.4/NOTEEVENTS.csv.gz
. For example, https://physionet.org/content/mimiciii-demo/1.4/D_ICD_DIAGNOSES.csv.LastUtteranceSearcher
. It should be similar to this tutorial. The instruction will no longer require users to define the environment variables since they should be already configured in the service pipeline, but we still need to include the steps of starting a remote service.stave
database, and this is not a safe operation. We should utilize backend APIs to interact withstave
database. A wrapper of these APIs is provided instave
backend lib.stave
projects and to manually configure the database path. With the new interfaces provided bystave
, we can automate these steps in a python script. Ideally users only need to run a python script, which should pop out a chatbot window for users to enter queries.