ConnectionError: HTTPConnectionPool(host='localhost', port=8080) #13

Closed chinyii closed 3 years ago

chinyii commented 3 years ago

Hello, I am trying to use Openscoring as a web service to get APIs. I am very new to this so I do hope to get some help.

from openscoring import Openscoring

os = Openscoring(base_url = "http://localhost:8080/openscoring")

This part runs in Jupyter Notebook without issue. However, when I try to open the base URL, the site cannot be reached. The main error occurs in the next line of code, which deploys the PMML file.

os.deployFile("Traffic", "lr_model.pmml")

The full error message is as stated:

I sincerely apologize if this is a dumb issue, but I do not know why or what is wrong. I have also run spark-shell --jars D:\spark-3.0.2-bin-hadoop2.7\bin\jpmml-sparkml-executable-${1.6.5}.jar with no issues. I am currently using Spark 3.0.2 and Python 3.8.8. Thank you for your time and consideration. Please do let me know if more information is required from me.

vruusmann commented 3 years ago

os = Openscoring(base_url = "http://localhost:8080/openscoring") This part runs on Jupyter Notebook without issue

This should always run OK, because the Openscoring constructor does not do any networking - it simply stores the base URL for making more specialized API calls.

os.deployFile("Traffic", "lr_model.pmml") attempting to go to the base url, results in the site not being able to be reached. The full error message is as stated: .. [WinError 10061] No connection could be made because the target machine actively refused it

Now you're about to perform the first network operation, but it fails, because you don't have an Openscoring server application running.

Are you sure you have Openscoring server running at http://localhost:8080/openscoring? What do you see if you copy&paste this URL to a web browser?

You're saying that you're working with a Jupyter notebook. Is it a local or remote Jupyter notebook? If it's a remote one, then http://localhost:8080/openscoring is relative to that remote server machine. You can't connect from a remote machine to a local machine just like that.
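One way to tell these cases apart from inside the notebook is to probe the URL before calling deployFile. The following is a minimal sketch using only the Python standard library. The helper name is_server_reachable is hypothetical and is not part of the openscoring package. It treats any HTTP response, including an error status, as proof that something is listening, and only a refused or failed connection as unreachable.

```python
import urllib.error
import urllib.request


def is_server_reachable(url, timeout=3.0):
    """Return True if anything answers HTTP at url, even with an error status.

    Hypothetical helper for illustration; not part of the openscoring package.
    """
    # Bypass any configured HTTP proxy, since the target is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, host not found, timeout, and so on.
        return False


if __name__ == "__main__":
    print(is_server_reachable("http://localhost:8080/openscoring"))
```

If this returns False when run in the same notebook that calls deployFile, the problem is on the network or server side, not in the openscoring client.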

vruusmann commented 3 years ago

If you didn't have Openscoring server running, then see here:

chinyii commented 3 years ago

If you didn't have Openscoring server running, then see here:

Oh dear, I thought all I had to do was just use it through Jupyter Notebook. I did the steps and launched the server. However, through Jupyter Notebook, I ran into another error. ValueError: User not authorized. I think it has something to do with the advanced configuration for Openscoring, but the system can't seem to find the application.conf file.

It produced this error when trying to run "java -Dconfig.file=application.conf -jar openscoring-server-executable-2.0.4.jar" I have included some screenshots for reference. Thank you once again for your time! I am so sorry about these issues, as I am very new to this.





Edit: Oh and this is what shows when I start the Openscoring server with defaults. This is what I got when going to http://localhost:8080/openscoring


vruusmann commented 3 years ago

It produced this error when trying to run java -Dconfig.file=application.conf -jar openscoring-server-executable-2.0.4.jar ValueError: User not authorized

You have the Openscoring server running now, and your Jupyter notebook is able to connect to it, because what you're seeing is a proper Openscoring server-generated error message!

This message - "User not authorized" - means that you're trying to invoke an admin action (here: deploying a new model) without having provided any admin credentials.

I have included some screenshots for reference.

I can see that you've edited the default application.conf, and tweaked the access control section.

I can also see that the edited config file now contains a parse error - there is an unbalanced right curly brace character (}) at the end of networkSecurityConfigFilter section.

Perhaps the Openscoring server is loading the config file only partially, and therefore fails to grant admin access?

See the Openscoring server log file, which should contain more detailed information about the active configuration and any admin grant/deny decision.
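An unbalanced curly brace like the one mentioned above can be spotted before starting the server. The following is a rough sketch, not a real config parser: the function name brace_balance is hypothetical, it ignores braces inside quoted strings, and it only skips lines that start with # or //.

```python
def brace_balance(text):
    """Return (balance, first_bad_line) for a config file's text.

    balance is the count of '{' minus '}' over the whole text (0 is balanced).
    first_bad_line is the 1-based line where a '}' first appears without a
    matching '{', or None if that never happens.
    Hypothetical helper; braces inside quoted strings are not handled.
    """
    depth = 0
    first_bad_line = None
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip whole-line comments.
        if stripped.startswith("#") or stripped.startswith("//"):
            continue
        for ch in line:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth < 0 and first_bad_line is None:
                    first_bad_line = line_no
    return depth, first_bad_line


if __name__ == "__main__":
    # Made-up config text with one extra closing brace on line 4.
    sample = "section {\n  key = 1\n}\n}\n"
    print(brace_balance(sample))
```

A negative balance means there are more closing than opening braces, and first_bad_line points at the first brace that has nothing to close.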

Edit: Oh and this is what shows when I start the Openscoring server with defaults. Message: Not found

That counts as a success.

You're actually accessing a non-existent endpoint. There's nothing mapped to the root path (/openscoring/). Try adding a /model suffix to it (/openscoring/model/) to actually get the list of deployed models (should be an empty list initially).
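The same check can be done from the notebook instead of a web browser. This is a minimal sketch using only the Python standard library; the function names model_list_url and fetch_model_list are hypothetical, and the response body is returned as raw text because its exact shape is not shown in this thread.

```python
import urllib.request


def model_list_url(base_url):
    """Build the model collection URL by adding the /model suffix to the base URL."""
    return base_url.rstrip("/") + "/model"


def fetch_model_list(base_url, timeout=3.0):
    """GET the model collection and return the raw response body as text.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(model_list_url(base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(model_list_url("http://localhost:8080/openscoring"))
```

Calling fetch_model_list("http://localhost:8080/openscoring") against a freshly started server should then show the empty list of deployed models.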

vruusmann commented 3 years ago

If you're very new to this, then I'd suggest that you avoid editing the application.conf file.

Start the Openscoring server with the default (built-in) configuration, and see if you can invoke admin-level actions (deploy a model, undeploy a model) or not. IIRC, it should be possible in localhost mode.

chinyii commented 3 years ago

@vruusmann I fixed the curly brace in application.conf as you said, and I realized why the server did not detect the application.conf file. I had to append .txt to the file name, because for some reason it was not recognized as plain application.conf. I think I need a custom application.conf, because the server with default settings does not authorize me to deploy the file.

After successfully running with the application.conf file, the error evolved into another error in Jupyter Notebook. Every step forward seems to have an error, I am so sorry for causing you trouble :(

Just in case, I will show the pmml file that I generated to deploy for context.

vruusmann commented 3 years ago

ValueError: Required element PMML/@isScorable=true is not defined

The uploaded PMML document does not contain any model elements.

You're asking the Openscoring server to "deploy a model", but you're giving it a model-less (transformation only) PMML document.
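A PMML file can be checked for model elements locally before uploading it. The sketch below is only an approximation and is not the check that the Openscoring server performs: the function name find_model_elements is hypothetical, and MODEL_ELEMENTS lists only a subset of the model element names defined by the PMML schema.

```python
import xml.etree.ElementTree as ET

# A subset of PMML model element names; the PMML schema defines more.
MODEL_ELEMENTS = {
    "ClusteringModel",
    "GeneralRegressionModel",
    "MiningModel",
    "NaiveBayesModel",
    "NeuralNetwork",
    "RegressionModel",
    "SupportVectorMachineModel",
    "TreeModel",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_model_elements(pmml_text):
    """Return the names of known model elements directly under the PMML root."""
    root = ET.fromstring(pmml_text)
    return [
        local_name(child.tag)
        for child in root
        if local_name(child.tag) in MODEL_ELEMENTS
    ]


if __name__ == "__main__":
    # Made-up, transformation-only document for illustration.
    doc = (
        '<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">'
        "<Header/><DataDictionary/><TransformationDictionary/></PMML>"
    )
    print(find_model_elements(doc))
```

An empty result suggests that the document is transformation-only and that a model stage still needs to be added to the pipeline before conversion.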

Try getting started with some valid/model-ful PMML document. For example,

chinyii commented 3 years ago

@vruusmann I see, I think I understand now, my PipelineModel is inadequate. Do you have any resources for building pipeline models for pyspark on Jupyter Notebook? I was able to evaluate the model on Jupyter Notebook, so I thought I could just pmml the pipeline model. I really appreciate all the help you have given!

Edit: I can see the end product, but I am not so sure about the process for DecisionTreeIris.pmml

Try getting started with some valid/model-ful PMML document. For example,

vruusmann commented 3 years ago

I was able to evaluate the model on Jupyter Notebook, so I thought I could just pmml the pipeline model.

Apache Spark ML pipelines may be transformation-only. Your pipeline was just performing data transformations; it did not learn anything from the transformed data, so it's kind of half-way done anyway.

Do you have any resources for building pipeline models for pyspark on Jupyter Notebook?

There's nothing special about PMML wrt PySpark and/or Jupyter Notebooks.

Any model-ful pipeline should be okay. Just take your existing pipeline, and append a DecisionTreeRegressor or DecisionTreeClassifier to it (depending on the nature of the label column).

Alternatively, see this: