PAIR-code / lit

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
https://pair-code.github.io/lit
Apache License 2.0

No widget displays when using LIT in SageMaker (Notebook Instance) #735

Open superhez opened 2 years ago

superhez commented 2 years ago

Hi! I am trying the demo below in both my local JupyterLab and an AWS SageMaker notebook instance. Everything works in my local Jupyter, but in SageMaker no widget displays and the cell's output shows a server connection timeout error. I have confirmed that the versions of the related packages are almost the same in the two environments. Could anyone give me some advice on how to deal with this problem?

DEMO: https://colab.research.google.com/github/PAIR-code/lit/blob/main/lit_nlp/examples/notebooks/LIT_sentiment_classifier.ipynb

Browser: Microsoft Edge 101.0.1210.53

■ SageMaker notebook instance: Python==3.7.12, jupyterlab==1.2.21, tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

※No VPC

■ Kernel conda_python3: Python==3.6.13, jupyterlab==1.2.21, tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

■ My local JupyterLab (for reference): Python==3.7.12, jupyterlab==1.2.6 or 3.3.2 (both OK), tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

jameswex commented 2 years ago

We have not tried LIT in SageMaker before, so it's not surprising that there are some issues. It could be related to a proxy server being used by the notebook instance. The LitWidget class accepts a proxy URL as a constructor argument for environments where this is the case. That was an issue for using LIT in Google Vertex AI Workbench notebooks; there, setting the proxy URL to "/proxy/%PORT%/" solved the problem. Let me know if you are able to make progress.
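For illustration, a minimal sketch of what this looks like in the notebook, assuming the "%PORT%" token is a template that LIT fills in with the port the backend server actually bound to (the `expand_proxy_url` helper below is hypothetical, written only to show the substitution):

```python
# Hypothetical helper illustrating the substitution assumed above:
# the literal token "%PORT%" in the proxy_url template is replaced
# with the port the LIT backend server bound to.
def expand_proxy_url(proxy_url, port):
    return proxy_url.replace("%PORT%", str(port))

# In the notebook itself, the widget would be created along these
# lines (requires lit_nlp, and `models`/`datasets` from the demo):
#   from lit_nlp import notebook
#   widget = notebook.LitWidget(models, datasets, proxy_url="/proxy/%PORT%/")
#   widget.render()

print(expand_proxy_url("/proxy/%PORT%/", 8890))  # -> /proxy/8890/
```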

superhez commented 2 years ago

Hello, jameswex. Thanks so much for your reply! Following your advice, I tried "/proxy/%PORT%/" and LIT's GUI can now be displayed!

However, a new issue has come up: only the data list is shown in the GUI; there is no classification result, and the realtime prediction does not work. An error saying "Uncaught error: 'NoneType' object is not subscriptable (and 3 other errors)" appears at the bottom of the GUI. When I pressed F12 to check the browser's network activity, failing requests named "get_preds?..." (URL: https://MY_SAGEMAKER_URL/PROXY/PORT/get_preds?model=sst_tiny...) returned status code "500 Internal Server Error".

Meanwhile, in my local JupyterLab everything works when the same operation connects to "http://localhost:PORT/get_preds?...". So it seems that when the GUI tries to fetch the prediction results it accesses a wrong URL, though I am not sure about that. Could you kindly give me some further advice?

jameswex commented 2 years ago

It's not simple for me to get an AWS instance to test with. Would you be willing to try this in your notebook:

# Run the model's predict() directly on the first two examples, bypassing the LIT server:
import pandas as pd
pd.DataFrame(list(models['sst_tiny'].predict(datasets['sst_dev']._examples[0:2])))

This will verify whether the model can successfully call predict on the dataset outside the scope of the app's calls to the backend. On success, you will see a dataframe with two rows containing the model's results on the first two datapoints.

superhez commented 2 years ago

> It's not simple for me to get an AWS instance to test with. Would you be willing to try this in your notebook:
>
> import pandas as pd
> pd.DataFrame(list(models['sst_tiny'].predict(datasets['sst_dev']._examples[0:2])))
>
> This will verify if the model can successfully call predict on the dataset outside the scope of the app making calls to the backend. On success, you will see a dataframe with two rows of data from the results of the model on the first two datapoints.

Thanks for your kind reply. I added the code, and the first two datapoints are successfully listed with a series of columns named "cls_emb", "input_embs", "layer_0/avg_emb"... "probas"... So that means there is no problem with reading the data or calling the prediction, and the problem could be in the front end with the GUI, right? I thought everything was fine once the GUI displayed successfully, but the prediction does not work there.

jameswex commented 2 years ago

> An error says "Uncaught error: 'NoneType' object is not subscriptable (and 3 other errors)" at the bottom of GUI.

If you click that error text at the bottom of the screen, does the dialog box show a more detailed error? If so, please paste here. Thanks.

superhez commented 2 years ago

> An error says "Uncaught error: 'NoneType' object is not subscriptable (and 3 other errors)" at the bottom of GUI.
>
> If you click that error text at the bottom of the screen, does the dialog box show a more detailed error? If so, please paste here. Thanks.

Here are the "Error Details". Thank you.

Error: Uncaught error: 'NoneType' object is not subscriptable

Details:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 191, in __call__
    return self._ServeCustomHandler(request, clean_path, environ)(
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 176, in _ServeCustomHandler
    return self._handlers[clean_path](self, request, environ)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 385, in _handler
    outputs = fn(data, **kw)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 182, in _get_preds
    preds = self._predict(data['inputs'], model, dataset_name)
TypeError: 'NoneType' object is not subscriptable

Error: Uncaught error: 'NoneType' object is not subscriptable

Details:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 191, in __call__
    return self._ServeCustomHandler(request, clean_path, environ)(
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 176, in _ServeCustomHandler
    return self._handlers[clean_path](self, request, environ)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 385, in _handler
    outputs = fn(data, **kw)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 305, in _get_interpretations
    model_outputs = self._predict(data['inputs'], model, dataset_name)
TypeError: 'NoneType' object is not subscriptable

(The remaining errors are verbatim repeats of these two tracebacks.)

jameswex commented 2 years ago

Thanks! That seems to suggest that, when the server is launched inside SageMaker, the get_preds HTTP POST request from the front end to the backend is losing its payload (data), which should be a dict containing the datapoint IDs to get predictions for. I wonder if something about SageMaker's security model is affecting the LIT webserver running inside it.
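As a sketch of this failure mode (an assumption, inferred from the tracebacks above: if a proxy strips the POST body, the handler's parsed request data would arrive as None instead of a dict, and `data['inputs']` would raise exactly this error):

```python
import json

# Hypothetical stand-in for the server-side body parsing: returns the
# parsed JSON dict, or None when the request body is empty/missing.
def parse_body(raw):
    return json.loads(raw) if raw else None

data = parse_body(None)  # simulates a request whose body was lost in transit
try:
    data['inputs']        # the line that fails in _get_preds
except TypeError as err:
    print(err)            # 'NoneType' object is not subscriptable
```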

superhez commented 2 years ago

> Thanks! That seems to suggest that the payload in the get_preds HTTP Post request from the front-end to the backend, when the server is launched inside of SageMaker, is losing its request payload (data), which should be a dict that contains the datapoint IDs to get predictions for. I wonder if something about SageMaker's security model is affecting the performance of the LIT webserver running inside of it.

Thank you so much for identifying the problem. With this information, I am now contacting the support center on the AWS side to check whether some SageMaker setting can help solve the issue.

jameswex commented 2 years ago

Keep me updated on anything they say back. Hopefully there's some change we can make in LIT to support SageMaker, if we've identified the true problem here.

pri2si17-1997 commented 1 year ago

Hi @jameswex @superhez, were you able to run the demo in a SageMaker notebook instance or SageMaker Studio instance?