Closed: jeff1evesque closed this issue 7 years ago.
We attempted a `model_predict` session on the web-interface, using a model built from a collection of `svm` datasets. However, upon form submission, our `flask.log` contained the following traceback:
[2017-07-26 07:59:00,986] {/usr/local/lib/python2.7/dist-packages/flask/app.py:1560} ERROR - Exception on /load-data [POST]
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1982, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1614, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1517, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1598, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/vagrant/interface/views.py", line 125, in load_data
response = loader.load_model_predict()
File "/vagrant/brain/load_data.py", line 178, in load_model_predict
session = ModelPredict(self.data)
File "/vagrant/brain/session/model_predict.py", line 54, in __init__
self.model_id = self.prediction_settings['model_id']
KeyError: 'model_id'
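The traceback shows `ModelPredict.__init__` assuming `'model_id'` exists in `prediction_settings`. A hypothetical sketch of a defensive variant (the class and attribute names mirror the traceback; the error-collection style is an assumption, not the project's actual implementation):

```python
# Hypothetical sketch: guard the 'model_id' lookup in ModelPredict.__init__,
# so a missing key surfaces as a session error instead of an unhandled KeyError.
class ModelPredict(object):
    def __init__(self, settings):
        self.prediction_settings = settings
        self.list_error = []

        # dict.get() avoids the KeyError seen in flask.log when the
        # submitted form payload omits 'model_id'
        self.model_id = self.prediction_settings.get('model_id')
        if self.model_id is None:
            self.list_error.append("missing required key: 'model_id'")

    def get_errors(self):
        return self.list_error


session = ModelPredict({'collection': 'collection-1136'})
print(session.get_errors())
```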
290203d: the currently committed code generates the following. The corresponding `error.log`:
[2017-07-28 11:36:36,536] {/vagrant/log/logger.py:165} DEBUG - brain.load_data: /brain/load-data.py, session: <brain.session.model_predict.ModelPredict object at 0x7fd63147f850>
[2017-07-28 11:36:36,536] {/vagrant/log/logger.py:165} DEBUG - brain.load_data: /brain/load-data.py, validate_arg_none: False
[2017-07-28 11:36:36,537] {/vagrant/log/logger.py:165} DEBUG - brain.load_data: /brain/load-data.py, session.get_errors(): []
[2017-07-28 11:36:36,538] {/vagrant/log/logger.py:165} DEBUG - brain.session.model_predict: /brain/session/model_predict.py, self.collection: u'collection-1136'
[2017-07-28 11:36:36,539] {/vagrant/log/logger.py:165} DEBUG - brain.session.model_predict: /brain/session/model_predict.py, self.predictors: [u'3', u'3', u'3', u'3', u'3', u'3', u'3']
[2017-07-28 11:36:36,546] {/vagrant/log/logger.py:165} DEBUG - brain.session.model_predict: /brain/session/model_predict.py, model_type: 'svm'
[2017-07-28 11:36:36,549] {/vagrant/log/logger.py:165} DEBUG - brain.load_data: /brain/load-data.py, my_prediction: {'model': 'svm', 'confidence': {'decision_function': [-5.9856473709828322, 7.3016894114551585, 3.4672599258030887], 'classes': [u'dep-variable-1', u'dep-variable-2', u'dep-variable-3'], 'probability': [0.011500715458176457, 0.042140046267255232, 0.94635923827456847]}, 'result': u'dep-variable-2', 'error': None}
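The `confidence` fields in the logged `my_prediction` (decision function, class labels, per-class probabilities) line up with scikit-learn's SVC API. A minimal sketch of how such a payload could be assembled, assuming a scikit-learn backend and using toy data in place of the real collection:

```python
# Sketch only: toy data and payload shape modeled on the my_prediction log
# entry; the actual training pipeline lives elsewhere in the project.
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [2, 2], [0, 1], [1, 2], [2, 3]]
y = ['dep-variable-1', 'dep-variable-2', 'dep-variable-3',
     'dep-variable-1', 'dep-variable-2', 'dep-variable-3']

# probability=True enables predict_proba alongside decision_function
clf = SVC(probability=True)
clf.fit(X, y)

observation = [[1, 1]]
prediction = {
    'model': 'svm',
    'confidence': {
        'decision_function': clf.decision_function(observation)[0].tolist(),
        'classes': clf.classes_.tolist(),
        'probability': clf.predict_proba(observation)[0].tolist(),
    },
    'result': clf.predict(observation)[0],
    'error': None,
}
print(prediction['result'])
```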
We were able to run the equivalent `svr` case of the above.
However, during a `data_new` session, we are unable to load xml dataset(s) on the web-interface for the `svr` case. So, we'll need to investigate the logic relating to `xml2dict.py`:
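One common pitfall when converting xml to a dict is that a lone child element parses to a single mapping while repeated elements parse to a list, which breaks downstream code expecting one shape. Since the internals of `xml2dict.py` aren't shown here, this is a hypothetical sketch using the stdlib, with made-up `<dataset>`/`<observation>` tags, of normalizing to a list either way:

```python
# Hypothetical sketch (not the project's xml2dict.py): always return the
# <observation> elements as a list, even when the document has only one.
import xml.etree.ElementTree as ET

def observations(xml_string):
    '''Return every <observation> element as a dict of tag -> text.'''
    root = ET.fromstring(xml_string)
    return [
        {child.tag: child.text for child in node}
        for node in root.iter('observation')
    ]

xml = '''
<dataset>
  <observation>
    <dependent-variable>dep-variable-1</dependent-variable>
    <indep-variable-1>23.45</indep-variable-1>
  </observation>
</dataset>
'''
print(observations(xml))
```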
The `model_generate` case fails when larger datasets are used with the json file format:
`svm-1.json`
`svr-1.json`
while the smaller json datasets succeed:
`svm.json`
`svr.json`
This means we'll likely need to investigate accepting larger array instances:
...
{
"dependent-variable": "dep-variable-4",
"independent-variables": [{
"indep-variable-1": 22.1,
"indep-variable-2": 95.96,
"indep-variable-4": 342,
"indep-variable-5": 66.67,
"indep-variable-6": 0.001,
"indep-variable-7": 32,
"indep-variable-3": 0.743
},
{
"indep-variable-1": 20.71,
"indep-variable-2": 99.33,
"indep-variable-4": 342,
"indep-variable-5": 75.67,
"indep-variable-6": 0.001,
"indep-variable-7": 30,
"indep-variable-3": 0.648
}]
},
...
Instead of the single observation instance from the json dataset(s):
{
"dependent-variable": "dep-variable-1",
"independent-variables": [{
"indep-variable-1": 23.45,
"indep-variable-2": 98.01,
"indep-variable-4": 325,
"indep-variable-5": 54.64,
"indep-variable-6": 0.002,
"indep-variable-7": 23,
"indep-variable-3": 0.432
}]
},
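One way to accept the larger syntax is to expand each entry's `independent-variables` array into one row per observation, reducing it to the single-observation case already handled. The helper below is a hypothetical sketch, not the project's parser:

```python
# Hypothetical sketch: flatten the longer json dataset syntax, where one
# dependent-variable carries several observation dicts, into one
# single-observation entry per row.
def flatten(dataset):
    rows = []
    for entry in dataset:
        label = entry['dependent-variable']
        for observation in entry['independent-variables']:
            rows.append({'dependent-variable': label,
                         'independent-variables': [observation]})
    return rows

data = [{
    'dependent-variable': 'dep-variable-4',
    'independent-variables': [
        {'indep-variable-1': 22.1},
        {'indep-variable-1': 20.71},
    ],
}]
print(len(flatten(data)))  # 2 rows, one per observation
```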
Note: the programmatic-interface also implements the former, longer json dataset syntax, which has currently been failing. Solving the above problem could likely fix the current travis ci builds.
Note: we may remove gunicorn and nginx from our current travis build, since they are likely pointless and redundant, given how pytest-flask implements the `live_server` fixture. So, it would make sense to open a dedicated issue to create a separate unit test, responsible for checking the configuration of the webserver and reverse proxy settings, for any arbitrary application.
731b92a: we are leveraging travis ci by raising a `ValueError`, since the RESTClient firefox plugin for osx is buggy, and does not return a response body when a `post` request (with an `application/json` header) is sent. Alternative approaches involve either using a windows host, or adding the certificate to the browser for this corresponding application.
Our programmatic-interface, as well as the current travis ci builds, have been failing, because they reference datasets from the `master` branch:
{
"properties": {
"session_name": "sample_svm_title",
"collection": "svm-424-5",
"dataset_type": "dataset_url",
"session_type": "data_new",
"model_type": "svm",
"stream": "True"
},
"dataset": [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm-1.json"
]
}
However, based on the changes worked in this issue (i.e. the `feature-2844` branch), either the `master` branch needs to be updated with datasets from the `feature-2844` branch, or we'd have to (temporarily) reference the adjusted datasets:
{
"properties": {
"session_name": "sample_svm_title",
"collection": "svm-424-5",
"dataset_type": "dataset_url",
"session_type": "data_new",
"model_type": "svm",
"stream": "True"
},
"dataset": [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/33dbb0fa1e65b7ddb28a7d43919a7843d7f0236b/interface/static/data/json/web_interface/svm.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/33dbb0fa1e65b7ddb28a7d43919a7843d7f0236b/interface/static/data/json/web_interface/svm-1.json"
]
}
Note: this likely means that when this issue is initially merged, the build will be failing. However, shortly after merging, we can manually retrigger the travis ci build to account for the adjusted `master` branch.
90e6d38: we should investigate whether the `dataset` structure, defined within `/brain/session/model/sv.py`, varies between the web-interface and the programmatic-interface.
036bd3d: our `logger` debug statement may suggest that the `restructure` method is not properly defining the `dataset` property for the web-interface. This is indicated by the corresponding output in `error.log` when the above form is submitted:
[2017-08-01 21:21:03,613] {/vagrant/log/logger.py:165} DEBUG - brain.session.data.dataset: /brain/session/data/dataset.py, datasets: {'error': None, 'properties': {'stream': False, 'session_type': u'data_new', 'collection': u'collection-file-upload-7', 'dataset_type': u'dataset_url', 'model_type': u'svm', 'session_name': u'test', 'dataset[]': u'https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm.json'}, 'dataset': None}
So, we'll need to take a closer look at both `/brain/converter/settings.py` and `/brain/session/data/dataset.py`, by adding appropriate debug `logger` statements to the former `settings.py`.
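The `error.log` entry above shows the submitted url stranded under `properties['dataset[]']` while the top-level `dataset` stays `None`. A hypothetical sketch of what a restructure step might need to do (the function body and the placeholder url are assumptions; only the key names come from the log):

```python
# Hypothetical sketch: pop the html-form-style 'dataset[]' key out of
# 'properties' and promote it to the top-level 'dataset' list, which the
# error.log shows being left as None for the web-interface.
def restructure(payload):
    urls = payload['properties'].pop('dataset[]', None)
    if urls is not None:
        # html forms submit a single string when one 'dataset[]' field is set
        payload['dataset'] = urls if isinstance(urls, list) else [urls]
    return payload

payload = {
    'error': None,
    'dataset': None,
    'properties': {
        'dataset_type': 'dataset_url',
        'dataset[]': 'https://example.com/svm.json',  # placeholder url
    },
}
print(restructure(payload)['dataset'])
```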
We've verified that the web-interface behaves as expected. So, we'll proceed by reviewing the latest travis ci builds, to determine how to resolve current bugs for the programmatic-interface.
Note: the above is a statement verifying that the corresponding logic executed without raising errors.
After #2842 is resolved, we need to determine the corresponding nosql data structure, and implement it accordingly in our python backend logic.