jeff1evesque / machine-learning

Web-interface + rest API for classification and regression (https://jeff1evesque.github.io/machine-learning.docs)

Ensure 'data_*.py' parses svr data into database #2586

Closed jeff1evesque closed 8 years ago

jeff1evesque commented 8 years ago

We need to ensure SVR datasets are properly parsed into the db_machine_learning database for the following two cases:

This enhancement needs to apply to both the web-interface and the programmatic api. Also, #2587 needs to be resolved prior to this issue.

jeff1evesque commented 8 years ago

We need to adjust the following modules:

jeff1evesque commented 8 years ago

Each converter method within convert_dataset.py needs to check for the following dataset types, in order to properly use the imported [svm|svr]_[csv|json|xml]_converter modules:

Additionally, we need to add the model_type attribute to both the data_new and data_append json sample dataset files. We'll also need to adjust our README.md to note some of these changes.

jeff1evesque commented 8 years ago

Currently, using the web-interface with a json dataset generates the following traceback:

...
/vagrant/brain/database/db_query.py:119: Warning: Field 'model_type' doesn't have a default value
  self.cursor.execute(sql_statement, sql_args)
{u'dataset': [{u'criterion': 34.543, u'predictors': {u'predictor-1': 23.45, u'predictor-2': 98.01, u'predictor-3': 0.432, u'predictor-4': 325, u'predictor-5': 54.64, u'predictor-6': 0.002, u'predictor-7': 25}}, {u'criterion': 54.666, u'predictors': {u'predictor-1': 24.32, u'predictor-2': 92.22, u'predictor-3': 0.356, u'predictor-4': 235, u'predictor-5': 64.45, u'predictor-6': 0.001, u'predictor-7': 31}}, {u'criterion': 22.02, u'predictors': {u'predictor-1': 22.67, u'predictor-2': 101.21, u'predictor-3': 0.832, u'predictor-4': 427, u'predictor-5': 75.45, u'predictor-6': 0.002, u'predictor-7': 24}}]}
[2016-06-12 14:39:00,876] ERROR in app: Exception on /load-data/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/vagrant/interface/views.py", line 101, in load_data
    response = loader.load_data_new()
  File "/vagrant/brain/load_data.py", line 70, in load_data_new
    session.dataset_to_dict(session_id)
  File "/vagrant/brain/session/base_data.py", line 177, in dataset_to_dict
    if response['error']:
TypeError: 'bool' object has no attribute '__getitem__'
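
The TypeError above means that response was a plain bool, not a dictionary, when response['error'] was evaluated. A minimal defensive sketch follows, assuming the upstream call can return either a bool or a dictionary with an error key. The helper name extract_error is hypothetical and not part of the repository. Under Python 3 the same mistake is reported as "'bool' object is not subscriptable".

```python
def extract_error(response):
    """Return an error message, or None, from a bool-or-dict response."""
    # a dictionary response is expected to carry an 'error' key
    if isinstance(response, dict):
        return response.get('error')

    # assumption: a bare False signals a failure without any details
    if response is False:
        return 'operation failed without an error message'

    # True (or any other value) is treated as success
    return None
```

With a guard like this, a caller can write error = extract_error(response) and branch on error, instead of indexing a value that may not be a dictionary.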

An svm dataset has the following structure:

{u'dep-variable-1': [{u'indep-variable-6': 0.002, u'indep-variable-7': 25, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}], u'dep-variable-3': [{u'indep-variable-6': 0.002, u'indep-variable-7': 24, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}], u'dep-variable-2': [{u'indep-variable-6': 0.001, u'indep-variable-7': 31, u'indep-variable-4': 235, u'indep-variable-5': 64.45, u'indep-variable-2': 92.22, u'indep-variable-3': 0.356, u'indep-variable-1': 24.32}]}
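
The two structures differ in where the dependent variable lives: the svm dataset keys each list of observations by its label, while the svr dataset carries a criterion value alongside a predictors dictionary. A rough sketch of reading labels and feature counts from each follows, using values copied from the output above. The function names are hypothetical, and the real converters may work differently.

```python
# svm: label -> list of observations (values taken from this thread)
svm_dataset = {
    'dep-variable-1': [{
        'indep-variable-1': 23.45, 'indep-variable-2': 98.01,
        'indep-variable-3': 0.432, 'indep-variable-4': 325,
        'indep-variable-5': 54.64, 'indep-variable-6': 0.002,
        'indep-variable-7': 25,
    }],
}

# svr: list of observations, each with a criterion and its predictors
svr_dataset = {
    'dataset': [{
        'criterion': 34.543,
        'predictors': {
            'predictor-1': 23.45, 'predictor-2': 98.01,
            'predictor-3': 0.432, 'predictor-4': 325,
            'predictor-5': 54.64, 'predictor-6': 0.002,
            'predictor-7': 25,
        },
    }],
}


def summarize_svm(dataset):
    """Return (labels, distinct feature counts) for an svm-style dataset."""
    labels = sorted(dataset)
    counts = sorted({len(obs) for group in dataset.values() for obs in group})
    return labels, counts


def summarize_svr(payload):
    """Return (labels, distinct feature counts) for an svr-style dataset."""
    labels = [str(item['criterion']) for item in payload['dataset']]
    counts = sorted({len(item['predictors']) for item in payload['dataset']})
    return labels, counts
```

Both summaries report a single feature count of 7 for these samples, which is the value the database should end up storing.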
jeff1evesque commented 8 years ago

We truncated all tables from db_machine_learning, then performed a Data New session through the web browser. For the following tables:

MariaDB [db_machine_learning]> show tables;
+-------------------------------+
| Tables_in_db_machine_learning |
+-------------------------------+
| tbl_dataset_entity            |
| tbl_feature_count             |
| tbl_feature_value             |
| tbl_model_type                |
| tbl_observation_label         |
+-------------------------------+

Two tables had bad data inserted, and one did not have anything, as a result of truncating all tables within the corresponding database. The following are the incorrect tables (the first table should have count_features: 7):

MariaDB [db_machine_learning]> select * from tbl_feature_count;
+---------+-----------+----------------+
| id_size | id_entity | count_features |
+---------+-----------+----------------+
|       1 |         1 |              2 |
+---------+-----------+----------------+
1 row in set (0.00 sec)

MariaDB [db_machine_learning]> select * from tbl_observation_label;
+----------+-----------+--------------------+
| id_label | id_entity | dep_variable_label |
+----------+-----------+--------------------+
|        1 |         1 | 2                  |
|        2 |         1 | 4                  |
|        3 |         1 | .                  |
|        4 |         1 | 5                  |
|        5 |         1 | 3                  |
|        6 |         1 | 1                  |
|        7 |         1 | 0                  |
|        8 |         1 | 9                  |
|        9 |         1 | 6                  |
+----------+-----------+--------------------+
9 rows in set (0.00 sec)

As stated above, the following table is empty as a result of the truncation operation:

MariaDB [db_machine_learning]> select * from tbl_model_type;
Empty set (0.00 sec)
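
Both symptoms in the incorrect tables are consistent with an svr observation being handled at the wrong level, though this is an assumption and not something confirmed in this thread. Counting the top-level keys of an observation gives 2 instead of 7, and iterating over a stringified number yields single characters, like the one-character labels stored above. A small illustration:

```python
# one svr observation, copied from the dataset printed earlier in this thread
observation = {
    'criterion': 34.543,
    'predictors': {
        'predictor-1': 23.45, 'predictor-2': 98.01, 'predictor-3': 0.432,
        'predictor-4': 325, 'predictor-5': 54.64, 'predictor-6': 0.002,
        'predictor-7': 25,
    },
}

# counting the top-level keys ('criterion', 'predictors') gives 2
top_level_count = len(observation)

# counting the nested predictors gives the expected 7
predictor_count = len(observation['predictors'])

# iterating a stringified number yields single characters, not one label
characters = sorted(set(str(observation['criterion'])))
```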
jeff1evesque commented 8 years ago

Since the csv file structure is identical between the classification and regression model types, three cases remain for this overall issue:

jeff1evesque commented 8 years ago

We are receiving the following warning during a Data New session using an xml svr sample dataset:

/vagrant/brain/database/db_query.py:119: Warning: Data truncated for column 'dep_variable_label' at row 1
  self.cursor.execute(sql_statement, sql_args)

Therefore, we may need to adjust setup_tables.py.

jeff1evesque commented 8 years ago

2c2b4d0: resolves the truncated warning.

Now, we need to verify the programmatic api json converter, which can be done initially by manually running our unit tests.

jeff1evesque commented 8 years ago

We recursively ensured proper line endings from within the vagrant vm:

vagrant@vagrant-ubuntu-trusty-64:/vagrant$ sudo find /vagrant -type f -exec dos2unix {} \;

However, we are receiving the following tracebacks from the data_new and data_append unit tests, for both the svm and svr sessions:

vagrant@vagrant-ubuntu-trusty-64:/vagrant/test$ py.test
============================= test session starts ==============================

platform linux2 -- Python 2.7.6, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
rootdir: /vagrant/test, inifile: pytest.ini
collected 8 items

programmatic_interface/pytest_svm_session.py FF..
programmatic_interface/pytest_svr_session.py FF..

=================================== FAILURES ===================================

________________________________ check_data_new ________________________________

    def check_data_new():
        """@check_data_new

        This method tests the 'data_new' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svm-data-new.json', 'svm')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7feb7e57dde8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_type": "data_new"}, "dataset"...": 427, "indep-variable-5": 75.45, "indep-variable-2": 101.21, "indep-variable-3": 0.832, "indep-variable-1": 22.67}}}')
E        +    where <function post at 0x7feb7e57dde8> = requests.post
E        +    and   '{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_type": "data_new"}, "dataset"...": 427, "indep-variable-5": 75.45, "indep-variable-2": 101.21, "indep-variable-3": 0.832, "indep-variable-1": 22.67}}}' = get_sample_json('svm-data-new.json', 'svm')

programmatic_interface/pytest_svm_session.py:69: AssertionError
______________________________ check_data_append _______________________________

    def check_data_append():
        """@check_data_append

        This method tests the 'data_append' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svm-data-append.json', 'svm')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7feb7e57dde8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_id": "1", "session_type": "da...": 235, "indep-variable-5": 64.45, "indep-variable-2": 92.22, "indep-variable-3": 0.356, "indep-variable-1": 24.32}]}}')
E        +    where <function post at 0x7feb7e57dde8> = requests.post
E        +    and   '{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_id": "1", "session_type": "da...": 235, "indep-variable-5": 64.45, "indep-variable-2": 92.22, "indep-variable-3": 0.356, "indep-variable-1": 24.32}]}}' = get_sample_json('svm-data-append.json', 'svm')

programmatic_interface/pytest_svm_session.py:83: AssertionError
________________________________ check_data_new ________________________________

    def check_data_new():
        """@check_data_new

        This method tests the 'data_new' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svr-data-new.json', 'svr')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7feb7e57dde8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_type": "data_new"}, "dataset"...": 101.21, "predictor-3": 0.832, "predictor-4": 427, "predictor-5": 75.45, "predictor-6": 0.002, "predictor-7": 26}}]}')
E        +    where <function post at 0x7feb7e57dde8> = requests.post
E        +    and   '{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_type": "data_new"}, "dataset"...": 101.21, "predictor-3": 0.832, "predictor-4": 427, "predictor-5": 75.45, "predictor-6": 0.002, "predictor-7": 26}}]}' = get_sample_json('svr-data-new.json', 'svr')

programmatic_interface/pytest_svr_session.py:69: AssertionError
______________________________ check_data_append _______________________________

    def check_data_append():
        """@check_data_append

        This method tests the 'data_append' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svr-data-append.json', 'svr')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7feb7e57dde8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_id": "1", "session_type": "da...": 101.21, "predictor-3": 0.832, "predictor-4": 427, "predictor-5": 75.45, "predictor-6": 0.002, "predictor-7": 24}}]}')
E        +    where <function post at 0x7feb7e57dde8> = requests.post
E        +    and   '{"properties": {"session_name": "sample_title", "dataset_type": "json_string", "session_id": "1", "session_type": "da...": 101.21, "predictor-3": 0.832, "predictor-4": 427, "predictor-5": 75.45, "predictor-6": 0.002, "predictor-7": 24}}]}' = get_sample_json('svr-data-append.json', 'svr')

programmatic_interface/pytest_svr_session.py:83: AssertionError
====================== 4 failed, 4 passed in 0.83 seconds ======================

Note: after failing our above unit test(s), app.py would no longer be running. So, unless it is manually restarted (i.e. (cd /vagrant && python app.py)), all successive tests will fail.
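
Since a stopped app.py makes every later test fail with a refused connection, it can help to confirm that something is listening before running py.test. A minimal sketch using only the standard library follows. The helper name is_listening is hypothetical, and the host and port in the usage note are the ones shown in the test output (localhost, 5000).

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError when the connection is refused
        # or times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_listening('localhost', 5000) returning False would indicate that app.py needs to be restarted before the next test run.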

jeff1evesque commented 8 years ago

Our flask.log has the following entries, corresponding to the above comment:

...
[2016-06-15 22:02:24,798] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO -  * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
[2016-06-15 22:02:30,873] {/usr/local/lib/python2.7/dist-packages/flask/app.py:1587} ERROR - Exception on /load-data/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/vagrant/interface/views.py", line 64, in load_data
    response = loader.load_data_new()
  File "/vagrant/brain/load_data.py", line 70, in load_data_new
    session.dataset_to_dict(session_id)
  File "/vagrant/brain/session/base_data.py", line 171, in dataset_to_dict
    model_type = self.premodel_data['data']['settings']['model_type']
KeyError: 'model_type'
[2016-06-15 22:02:30,902] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO - 127.0.0.1 - - [15/Jun/2016 22:02:30] "POST /load-data/ HTTP/1.1" 500 -
[2016-06-15 22:02:30,948] {/usr/local/lib/python2.7/dist-packages/flask/app.py:1587} ERROR - Exception on /load-data/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/vagrant/interface/views.py", line 66, in load_data
    response = loader.load_data_append()
  File "/vagrant/brain/load_data.py", line 111, in load_data_append
    session.dataset_to_dict(session_id)
  File "/vagrant/brain/session/base_data.py", line 171, in dataset_to_dict
    model_type = self.premodel_data['data']['settings']['model_type']
KeyError: 'model_type'
[2016-06-15 22:02:30,955] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO - 127.0.0.1 - - [15/Jun/2016 22:02:30] "POST /load-data/ HTTP/1.1" 500 -
[2016-06-15 22:02:31,008] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO - 127.0.0.1 - - [15/Jun/2016 22:02:31] "POST /load-data/ HTTP/1.1" 200 -
[2016-06-15 22:02:31,020] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO - 127.0.0.1 - - [15/Jun/2016 22:02:31] "POST /load-data/ HTTP/1.1" 200 -
[2016-06-15 22:02:31,037] {/usr/local/lib/python2.7/dist-packages/flask/app.py:1587} ERROR - Exception on /load-data/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/vagrant/interface/views.py", line 64, in load_data
    response = loader.load_data_new()
  File "/vagrant/brain/load_data.py", line 70, in load_data_new
    session.dataset_to_dict(session_id)
  File "/vagrant/brain/session/base_data.py", line 171, in dataset_to_dict
    model_type = self.premodel_data['data']['settings']['model_type']
KeyError: 'model_type'
[2016-06-15 22:02:31,044] {/usr/local/lib/python2.7/dist-packages/werkzeug/_internal.py:87} INFO - 127.0.0.1 - - [15/Jun/2016 22:02:31] "POST /load-data/ HTTP/1.1" 500 -
[2016-06-15 22:02:31,081] {/usr/local/lib/python2.7/dist-packages/flask/app.py:1587} ERROR - Exception on /load-data/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/vagrant/interface/views.py", line 66, in load_data
    response = loader.load_data_append()
  File "/vagrant/brain/load_data.py", line 111, in load_data_append
    session.dataset_to_dict(session_id)
  File "/vagrant/brain/session/base_data.py", line 171, in dataset_to_dict
    model_type = self.premodel_data['data']['settings']['model_type']
KeyError: 'model_type'
...
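
Each KeyError above comes from reading settings['model_type'] when the posted properties do not include it. Adding model_type to the sample json files addresses the tests, but a guard could also turn the missing key into a reportable error. A minimal sketch follows, assuming the nested data/settings layout shown in the traceback. The function name get_model_type and the returned dictionary shape are hypothetical.

```python
# the two model types named in this thread
VALID_MODEL_TYPES = ('classification', 'regression')


def get_model_type(premodel_data):
    """Look up model_type without raising KeyError when it is absent."""
    settings = premodel_data.get('data', {}).get('settings', {})
    model_type = settings.get('model_type')

    # report a missing or unknown value instead of raising
    if model_type not in VALID_MODEL_TYPES:
        return {
            'model_type': None,
            'error': "missing or unsupported 'model_type' in settings",
        }

    return {'model_type': model_type, 'error': None}
```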
jeff1evesque commented 8 years ago

Currently, the unit tests contained within pytest_svr_session.py succeed. However, pytest_svm_session.py generates a traceback of the following form:

...
_____________________________ check_model_predict ______________________________

    def check_model_predict():
        """@check_model_predict

        This method tests the 'model_predict' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svm-model-predict.json', 'svm')
        )

programmatic_interface/pytest_svm_session.py:111:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

/usr/lib/python2.7/dist-packages/requests/api.py:88: in post
    return request('post', url, data=data, **kwargs)
/usr/lib/python2.7/dist-packages/requests/api.py:44: in request
    return session.request(method=method, url=url, **kwargs)
/usr/lib/python2.7/dist-packages/requests/sessions.py:455: in request
    resp = self.send(prep, **send_kwargs)
/usr/lib/python2.7/dist-packages/requests/sessions.py:558: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <requests.adapters.HTTPAdapter object at 0x7ff87a8ec490>
request = <PreparedRequest [POST]>, stream = False
timeout = <urllib3.util.Timeout object at 0x7ff87a8ecd10>, verify = True
cert = None, proxies = OrderedDict()

    def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
        """Sends PreparedRequest object. Returns Response object.

            :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
            :param stream: (optional) Whether to stream the request content.
            :param timeout: (optional) The timeout on the request.
            :param verify: (optional) Whether to verify SSL certificates.
            :param cert: (optional) Any user-provided SSL certificate to be trusted.
            :param proxies: (optional) The proxies dictionary to apply to the request.
            """

        conn = self.get_connection(request.url, proxies)

        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(request)

        chunked = not (request.body is None or 'Content-Length' in request.headers)

        if stream:
            timeout = TimeoutSauce(connect=timeout)
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)

        try:
            if not chunked:
                resp = conn.urlopen(
                    method=request.method,
                    url=url,
                    body=request.body,
                    headers=request.headers,
                    redirect=False,
                    assert_same_host=False,
                    preload_content=False,
                    decode_content=False,
                    retries=self.max_retries,
                    timeout=timeout
                )

            # Send the request.
            else:
                if hasattr(conn, 'proxy_pool'):
                    conn = conn.proxy_pool

                low_conn = conn._get_conn(timeout=timeout)

                try:
                    low_conn.putrequest(request.method,
                                        url,
                                        skip_accept_encoding=True)

                    for header, value in request.headers.items():
                        low_conn.putheader(header, value)

                    low_conn.endheaders()

                    for i in request.body:
                        low_conn.send(hex(len(i))[2:].encode('utf-8'))
                        low_conn.send(b'\r\n')
                        low_conn.send(i)
                        low_conn.send(b'\r\n')
                    low_conn.send(b'0\r\n\r\n')

                    r = low_conn.getresponse()
                    resp = HTTPResponse.from_httplib(
                        r,
                        pool=conn,
                        connection=low_conn,
                        preload_content=False,
                        decode_content=False
                    )
                except:
                    # If we hit any problems here, clean up the connection.
                    # Then, reraise so that we can handle the actual exception.
                    low_conn.close()
                    raise
                else:
                    # All is well, return the connection to the pool.
                    conn._put_conn(low_conn)

        except socket.error as sockerr:
            raise ConnectionError(sockerr)

        except MaxRetryError as e:
>           raise ConnectionError(e)
E           ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /load-data/ (Caused by <class 'socket.error'>: [Errno 111] Connection refused)

/usr/lib/python2.7/dist-packages/requests/adapters.py:378: ConnectionError
=========================== 4 failed in 0.66 seconds ===========================
jeff1evesque commented 8 years ago

We should truncate all our tables, and verify in the database that pytest_svr_session.py is accurate. Then, we can finally verify this by checking that we can implement the corresponding generated model on the web-interface. After the latter has been confirmed, we can proceed to debugging the svm programmatic interface.

Note: we renamed pytest_svm_session.py to _pytest_svm_session.py, to ignore the corresponding svm unit test(s).

jeff1evesque commented 8 years ago

The only table from db_machine_learning not being correctly populated during svr unit testing (which fully succeeds) is tbl_observation_label.

Conversely, the only table from db_machine_learning being correctly populated during svm unit testing is tbl_dataset_entity. The svm unit testing generates an error traceback similar to the above:

...
ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /load-data/ (Caused by <class 'socket.error'>: [Errno 111] Connection refused)
jeff1evesque commented 8 years ago

Our programmatic api for the svr model is functional. However, the web-interface is no longer working. We will proceed by stepping through svr_json_converter.py, and ensuring the logic performs as expected.

jeff1evesque commented 8 years ago

The web-interface for the svr dataset case is capable of storing all necessary components to the sql database, except for the following case:

Note: the above statement is predicated on a vagrant up build off the current master branch, then switching to the feature-2586 branch, and restarting app.py.

jeff1evesque commented 8 years ago

The web-interface and programmatic api have been successfully tested for the svr case. Now, we can proceed to testing the svm case.

Note: the above statement is predicated on a vagrant up build of the feature-2586 branch.

jeff1evesque commented 8 years ago

The web-interface for the svm case successfully stores corresponding items into the sql database. Specifically, the following programmatic-interface json dataset(s) were used:

jeff1evesque commented 8 years ago

We renamed _pytest_svm_session.py to pytest_svm_session.py, and manually reran our unit test(s). The results are similar to an earlier stated comment. Therefore, we only need to investigate the svm case for the programmatic api. Also, it's likely that our adjustments above for the svr case will translate to similar solutions for this remaining svm case.

jeff1evesque commented 8 years ago

We should debug in the following order until a solution is found:

# @dataset_to_dict.py
...
        # programmatic-interface
        elif upload['dataset']['json_string']:
            # classification
            if upload['settings']['model_type'] == 'classification':
                for dataset_json in upload['dataset']['json_string'].items():
                    # conversion
                    converter = Convert_Dataset(dataset_json, model_type, True)
                    converted = converter.json_to_dict()
                    count_features = converter.get_feature_count()

                    observation_labels.append(str(dataset_json['criterion']))

                    # build new (relevant) dataset
                    dataset.append({
                        'id_entity': id_entity,
                        'premodel_dataset': converted,
                        'count_features': count_features
                    })
...

# @convert_dataset.py
...
    def json_to_dict(self):
        '''@json_to_dict

        This method converts the supplied json file-object to a python
        dictionary.

        @self.observation_label, list containing dependent variable labels.

        '''

        # convert classification dataset
        if self.model_type == self.classification:
            data = svm_json_converter(self.raw_data, self.is_json)
...

The following needs to be built:

# @svm_json_converter.py
...
    # programmatic-interface
    else:
        dataset = raw_data
...
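
As a rough idea of what the programmatic-interface branch could look like once built, the sketch below accepts an already-parsed dictionary when is_json is true, and otherwise assumes a file-like json upload from the web-interface. It also accepts either a single observation or a list of observations per label, since the truncated payloads in the earlier test output suggest that the data_new and data_append samples differ in this respect. The function names and the row keys other than dep_variable_label are hypothetical, and this is not the repository's actual converter.

```python
import json


def load_svm_dataset(raw_data, is_json):
    """Return the svm dataset as a dictionary."""
    # programmatic-interface: the json string was already parsed upstream
    if is_json:
        return raw_data

    # web-interface (assumed): raw_data is a file-like json upload
    return json.load(raw_data)


def flatten_svm_dataset(dataset):
    """Flatten {label: observation(s)} into one row per feature value."""
    rows = []
    for label in sorted(dataset):
        observations = dataset[label]

        # a single observation may arrive as a bare dictionary
        if isinstance(observations, dict):
            observations = [observations]

        for observation in observations:
            for feature in sorted(observation):
                rows.append({
                    'dep_variable_label': label,
                    'indep_variable_label': feature,      # hypothetical key
                    'indep_variable_value': observation[feature],  # hypothetical key
                })
    return rows
```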
jeff1evesque commented 8 years ago

The unit tests correctly portray what should succeed (for the time being), and what should fail (since svr model generation is not complete):

vagrant@vagrant-ubuntu-trusty-64:/vagrant/test$ py.test
============================= test session starts ==============================

platform linux2 -- Python 2.7.6, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /vagrant/test, inifile: pytest.ini
collected 8 items

programmatic_interface/pytest_svm_session.py ....
programmatic_interface/pytest_svr_session.py ....

=========================== 8 passed in 5.56 seconds ===========================

vagrant@vagrant-ubuntu-trusty-64:/vagrant/test$ py.test
============================= test session starts ==============================

platform linux2 -- Python 2.7.6, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /vagrant/test, inifile: pytest.ini
collected 8 items

programmatic_interface/pytest_svm_session.py ....
programmatic_interface/pytest_svr_session.py ..FF

=================================== FAILURES ===================================

_____________________________ check_model_generate _____________________________

    def check_model_generate():
        """@check_model_generate

        This method tests the 'model_generate' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svr-model-generate.json', 'svr')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7ffbd3ec5ed8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"model_type": "regression", "sv_kernel_type": "poly", "session_id": "2", "session_type": "model_generate"}}')
E        +    where <function post at 0x7ffbd3ec5ed8> = requests.post
E        +    and   '{"properties": {"model_type": "regression", "sv_kernel_type": "poly", "session_id": "2", "session_type": "model_generate"}}' = get_sample_json('svr-model-generate.json', 'svr')

programmatic_interface/pytest_svr_session.py:97: AssertionError
_____________________________ check_model_predict ______________________________

    def check_model_predict():
        """@check_model_predict

        This method tests the 'model_predict' session.

        """

>       assert requests.post(
            endpoint_url,
            headers=headers,
            data=get_sample_json('svr-model-predict.json', 'svr')
        )
E       assert <Response [500]>
E        +  where <Response [500]> = <function post at 0x7ffbd3ec5ed8>('http://localhost:5000/load-data/', headers={'Content-Type': 'application/json'}, data='{"properties": {"model_id": "2", "prediction_input[]": ["22.22", "96.24", "338", "72.55", "0.001", "28", "0.678"], "session_type": "model_predict"}}')
E        +    where <function post at 0x7ffbd3ec5ed8> = requests.post
E        +    and   '{"properties": {"model_id": "2", "prediction_input[]": ["22.22", "96.24", "338", "72.55", "0.001", "28", "0.678"], "session_type": "model_predict"}}' = get_sample_json('svr-model-predict.json', 'svr')

programmatic_interface/pytest_svr_session.py:111: AssertionError
===================== 2 failed, 6 passed in 15.01 seconds ======================

vagrant@vagrant-ubuntu-trusty-64:/vagrant/test$