Closed jeff1evesque closed 7 years ago
We should additionally write unit tests alongside this issue.
We need to add the pymongo driver, so python can access our mongod instance.
Instead of defining another verbose `puppet/environment/vagrant/modules/package/manifests/pymongo.pp`, we will focus on consolidating such an implementation by tackling #2815.
4c84214: we generalized the installation of the `pymongo` driver, per https://github.com/jeff1evesque/machine-learning/issues/2844#issuecomment-292147314. If we need to be more explicit, we can easily revert to a similar implementation of 6d25045 and 5cd9c32.
We need to rewrite each of the converter methods from `brain/converter/dataset.py`:

- `dict`: simple conversion, with no structural data manipulation
- `dict`: we need to ensure the `dict` is the same structure as our above `json` case
- `dict`: similar to above cases

This means the corresponding `import` statements need to be adjusted accordingly. Additionally, we'll need to redefine the `get_observation_labels` and `self.count_features` methods, such that their counts and definitions are premised on the adjusted `dict` object. Once completed, the `save_dataset` method from `brain/session/data/dataset.py` will need to be adjusted. Specifically, we need to replace the following with an implementation that stores the corresponding dataset(s) into our mongodb store:
```python
for data in dataset:
    for select_data in data['premodel_dataset']:
        db_save = Feature({
            'premodel_dataset': select_data,
            'id_entity': data['id_entity'],
        })

        # save dataset element, append error(s)
        db_return = db_save.save_feature(model_type)
        if db_return['error']:
            list_error.append(db_return['error'])
```
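The replacement could be sketched roughly as follows. The collection handle, document shape, and return convention here are assumptions, mirroring the loop above rather than any final schema:

```python
# hedged sketch: replace the per-feature Feature()/save_feature loop with a
# single bulk insert into a mongodb collection. The document fields below are
# illustrative assumptions, not the project's final schema.
def save_dataset(collection, dataset, model_type):
    documents = [
        {
            'premodel_dataset': data['premodel_dataset'],
            'id_entity': data['id_entity'],
            'model_type': model_type,
        }
        for data in dataset
    ]

    try:
        collection.insert_many(documents)
        return {'error': None}
    except Exception as error:
        return {'error': str(error)}
```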
However, we may need to encapsulate more useful information than just the plain dataset(s). More generally, we need more parameters, which will help distinguish one dataset from another. Therefore, it is likely we need to provide more information from `save_premodel_dataset`, located in `brain/session/base_data.py`:
```python
# save dataset
response = save_dataset(self.dataset, self.model_type)
```
442be76: we simplified our assumption by grabbing the first element in the list (i.e. the first `dict`), and doing a `len` on its keys to retrieve the `feature_count`. This simplified assumption is predicated on the idea that any successive elements in the same, or a similar, list will be of the same size.
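The above simplification can be illustrated as follows (the dataset values are made up for the example):

```python
# illustrative dataset: a list of observation dicts, each assumed to share the
# same keys, per the simplified assumption above
dataset = [
    {'indep-variable-1': 23.45, 'indep-variable-2': 98.01, 'indep-variable-3': 0.432},
    {'indep-variable-1': 19.99, 'indep-variable-2': 97.78, 'indep-variable-3': 0.638},
]

# grab the first element, and count its keys
feature_count = len(dataset[0].keys())
```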
We need to phase out the following mariadb tables:

- `tbl_observation_label`
- `tbl_feature_count`

Instead of creating an explicit sql construct, we can directly access a random node from a mongodb document, for a specified session (i.e. a `data_new`, or `data_append` instance). Therefore, we can assume any mongodb document, for a given session, is properly structured, since it should only exist in the collection if it had passed an earlier collection validator, prior to database ingestion. So, we are allowed to arbitrarily choose any element from a document, with respect to a desired session instance, and obtain any information, such as a unique list of observation labels, or the feature count.
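As a rough sketch of the above idea, given any single document for a session (the nested field names below are assumptions, modeled on the collection dump later in this thread), the observation labels and feature count can be derived directly:

```python
# hypothetical document shape; in practice, `document` would come from a
# collection.find_one(...) call against the session's mongodb collection
document = {
    'data': {
        'dataset': {
            'json_string': {
                'dep-variable-1': [
                    {'indep-variable-1': 23.45, 'indep-variable-2': 98.01},
                ],
                'dep-variable-2': [
                    {'indep-variable-1': 24.32, 'indep-variable-2': 92.22},
                ],
            }
        }
    }
}

dataset = document['data']['dataset']['json_string']

# unique observation labels, and feature count, without any sql tables
observation_labels = sorted(dataset.keys())
feature_count = len(next(iter(dataset.values()))[0])
```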
Note: the collection validator will be implemented per #2986.
c413e1b: we should proceed by verifying that the following lines correctly execute:
...
converter = Dataset(dataset, model_type, True)
converted = converter.json_to_dict()
...
9a649a6: we need to ensure that `save_premodel_dataset` properly stores the corresponding dataset(s), via the `data_new` session, into its mongodb collection.
0280bf8: we need to define the `save_collection` method.
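A possible shape for `save_collection` (the signature and return convention are assumptions, mirroring the `save_feature` error-reporting style above, not the committed implementation):

```python
def save_collection(collection, document):
    '''
    hedged sketch: insert a single document into the given mongodb collection,
    and return its id, or the error message, instead of raising.
    '''
    try:
        result = collection.insert_one(document)
        return {'result': result.inserted_id, 'error': None}
    except Exception as error:
        return {'result': None, 'error': str(error)}
```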
Some of our recent python code has contributed to a `502` bad gateway error:
This most likely means that bad python code is breaking our gunicorn web server(s). As a result, our reverse proxy (i.e. nginx) is unable to direct traffic. However, when we switch to the `master` branch, followed by `vagrant destroy -f && vagrant up`, the browser renders the application, and the `POST` api executes requests as expected.
Note: we needed to clear the browser cache, to satisfy both firefox's restclient, as well as the general web-interface, when verifying that the `master` branch was still functional. It is likely that a `vagrant halt` would have sufficed, rather than a full rebuild; but this was not tested.
de3168f: we need to adjust our `feature_request` implementation, from `sv.py`.
It seems our travis ci finds a `KeyError`, when trying to `yaml.load` the nosql configurations from `database.yaml`. So, we've replicated the corresponding statements manually:
>>> import yaml
>>> with open('database.yaml', 'r') as stream:
... settings = yaml.load(stream)
... sql = settings['database']['mariadb']
... nosql = settings['database']['mongodb']
...
>>> print sql
{'username': 'authenticated', 'name': 'db_machine_learning', 'tester': 'tester', 'log_path': '/log/database', 'provisioner_password': 'password', 'host': 'localhost', 'root_password': 'password', 'provisioner': 'provisioner', 'tester_password': 'password', 'password': 'password'}
>>> print nosql
{'username': 'authenticated', 'password': 'password', 'name': 'dataset', 'storage': {'journal': {'enabled': True}, 'dbPath': ['/data', '/data/db']}, 'host': 'localhost', 'systemLog': {'verbosity': 1, 'destination': 'file', 'logAppend': True, 'systemLogPath': '/var/log/mongodb/mongod.log'}, 'net': {'bindIp': '127.0.0.1', 'port': 27017}, 'processManagement': {'fork': True, 'pidfilepath': '/var/run/mongodb/mongod.pid'}}
>>> print sql['host']
localhost
>>> print nosql['host']
localhost
Given the above output, it seems fair to assume that our approach is not unreasonable. Instead, we need to find the syntax limitation within the overall `factory.py`.
603b965: our above comment resulted in the mysterious `KeyError`, since puppet's hiera implementation required corresponding mongodb definitions, which were properly set for the vagrant development environment, but not for the docker unit test environment implemented by travis ci.
Our programmatic-api currently generates a `500` error upon the `data_new` session:
While our web-interface similarly generates a `500` error:
So, we inspected our mongodb immediately after, and discovered nothing was stored:
root@trusty64:/home/vagrant# mongo
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.4
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
Server has startup warnings:
2017-05-23T20:57:46.162-0400 I STORAGE [initandlisten]
2017-05-23T20:57:46.162-0400 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-05-23T20:57:46.162-0400 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten]
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten]
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten]
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten]
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-05-23T20:57:50.093-0400 I CONTROL [initandlisten]
> show dbs
admin 0.000GB
local 0.000GB
> show collections
> show users
This means we'll need to temporarily implement our `Logger` class, to determine what part of the code needs to be modified. Additionally, our travis ci may not be useful, since there currently is no mongodb instance for our `pymongo` connector to execute operations against, in the `docker` puppet environment (i.e. unit test build). This means we'll need to create another docker container, to be used by the corresponding unit tests.
Note: it is not unlikely, given the size of this issue, that we may partition the corresponding unit tests (i.e. docker build for mongodb) into a separate issue.
We may need to add something like the following within `mongodb/manifests/run.pp`:
```puppet
## create admin user
exec { 'create-admin-user':
  command => dos2unix(template('mongodb/create-user.erb')),
  onlyif  => dos2unix(template('mongodb/check-user.erb')),
  notify  => Service['upstart-mongod'],
}
```
This will likely entail the need to make `mongod.conf.erb` restart friendly. Therefore, we may need to remove the following, and somehow track the associated `pid`, so it can be restarted:
```
## restart upstart job continuously
respawn
```
We can use the following to test for user existence:
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.getUser('admin')"); echo $TEMP
MongoDB shell version v3.4.4 connecting to: mongodb://127.0.0.1:27017 MongoDB server version: 3.4.4 { "_id" : "test.admin", "user" : "admin", "db" : "test", "customData" : { "uid" : 1 }, "roles" : [ { "role" : "clusterAdmin", "db" : "admin" }, { "role" : "readWriteAnyDatabase", "db" : "admin" }, { "role" : "userAdminAnyDatabase", "db" : "admin" }, { "role" : "dbAdminAnyDatabase", "db" : "admin" }, { "role" : "readWrite", "db" : "test" } ] }
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.getUser('admin')" --quiet); echo $TEMP
{ "_id" : "test.admin", "user" : "admin", "db" : "test", "customData" : { "uid" : 1 }, "roles" : [ { "role" : "clusterAdmin", "db" : "admin" }, { "role" : "readWriteAnyDatabase", "db" : "admin" }, { "role" : "userAdminAnyDatabase", "db" : "admin" }, { "role" : "dbAdminAnyDatabase", "db" : "admin" }, { "role" : "readWrite", "db" : "test" } ] }
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.getUser('adminf')" --quiet); echo $TEMP
null
Additionally, we can use the following to create users:
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.createUser( { user: 'jeff1evesque', pwd: 'password', customData: { uid: 1 }, roles: [ { role: 'clusterAdmin', db: 'admin' }, { role: 'readWriteAnyDatabase', db: 'admin' }, { role: 'userAdminAnyDatabase', db: 'admin' }, { role: 'dbAdminAnyDatabase', db: 'admin' }] }, { w: 'majority' , wtimeout: 5000 } )" --quiet); echo $TEMP
Successfully added user: { "user" : "jeff1evesque", "customData" : { "uid" : 1 }, "roles" : [ { "role" : "clusterAdmin", "db" : "admin" }, { "role" : "readWriteAnyDatabase", "db" : "admin" }, { "role" : "userAdminAnyDatabase", "db" : "admin" }, { "role" : "dbAdminAnyDatabase", "db" : "admin" } ] }
So, we'll need to bootstrap the above into puppet logic. Though it's possible to use `file` to create an executable file, which could be implemented by puppet's `exec` directive, it may be better to simply write two erb templates, and execute them directly within a single `exec`, containing an `onlyif` condition.
We most likely have syntax errors in our python code, which is why our travis ci is now failing the registration unit test, having succeeded (i.e. 28f0c82) prior to splitting the mariadb docker container into two containers (i.e. one for mariadb, another for mongodb).
After a fresh `vagrant up` build, we noticed that our puppet implementation succeeded in provisioning our mongodb `authenticated` user:
$ vagrant ssh
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-31-generic x86_64)
* Documentation: https://help.ubuntu.com/
System information disabled due to load higher than 1.0
Last login: Thu May 25 00:47:19 2017
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.getUser('admin')" --quiet); echo $TEMP
null
vagrant@trusty64:~$ TEMP=$(mongo --eval "db.getUser('authenticated')" --quiet); echo $TEMP
{ "_id" : "test.authenticated", "user" : "authenticated", "db" : "test", "roles" : [ { "role" : "clusterAdmin", "db" : "admin" }, { "role" : "readWriteAnyDatabase", "db" : "admin" }, { "role" : "userAdminAnyDatabase", "db" : "admin" }, { "role" : "dbAdminAnyDatabase", "db" : "admin" } ] }
However, our travis ci was not able to reach the same level of success. Specifically, our travis ci unit tests failed with `connect failed`. So, we'll need to either determine how to properly start our mongodb, or how to configure the necessary bind ip, and related settings, within the corresponding docker container.
We were able to check that the mongod port 27017 on the `mongodb` container was open, by using the `nmap` command from the `redis` container:
vagrant@trusty64:/vagrant/test$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0813f1c5ce59 container-webserver "python app.py test" 6 hours ago Exited (0) 6 hours ago webserver-pytest
30a5eb2bfd4f container-mariadb "/bin/sh -c mysqld" 6 hours ago Up 6 hours mariadb
dfd46b76e6c3 container-webserver "python app.py run" 6 hours ago Exited (1) 6 hours ago webserver
120f76f1c1a8 container-mongodb "/bin/sh -c mongod..." 6 hours ago Up 6 hours mongodb
185aeb1587ab container-redis "/bin/sh -c redis-..." 6 hours ago Up 6 hours redis
31c41ab39585 container-default "/bin/bash" 6 hours ago Exited (0) 6 hours ago base
vagrant@trusty64:/vagrant/test$ sudo docker exec -it redis sudo nmap -p 27017 mongodb
Starting Nmap 6.40 ( http://nmap.org ) at 2017-06-02 07:48 EDT
Nmap scan report for mongodb (172.18.0.2)
Host is up (0.00014s latency).
rDNS record for 172.18.0.2: mongodb.app_nw
PORT STATE SERVICE
27017/tcp open unknown
MAC Address: 02:42:AC:12:00:02 (Unknown)
Nmap done: 1 IP address (1 host up) scanned in 0.47 seconds
Note: we manually installed `nmap` in the `redis` container via `docker exec`.
We were able to `telnet` from the `webserver` container to the `mongodb` container:
vagrant@trusty64:/vagrant/test$ sudo docker exec -it webserver sudo apt-get install -y telnet
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
telnet
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 67.1 kB of archives.
After this operation, 167 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu/ trusty/main telnet amd64 0.17-36build2 [67.1 kB]
Fetched 67.1 kB in 0s (175 kB/s)
Selecting previously unselected package telnet.
(Reading database ... 38442 files and directories currently installed.)
Preparing to unpack .../telnet_0.17-36build2_amd64.deb ...
Unpacking telnet (0.17-36build2) ...
Setting up telnet (0.17-36build2) ...
update-alternatives: using /usr/bin/telnet.netkit to provide /usr/bin/telnet (telnet) in auto mode
vagrant@trusty64:/vagrant/test$ sudo docker exec -it webserver sudo telnet mongodb 27017
Trying 172.18.0.3...
Connected to mongodb.
Escape character is '^]'.
Additionally, we were able to `ping` the `mongodb` container from the `webserver` container:
vagrant@trusty64:/vagrant/test$ sudo docker exec -it webserver sudo ping mongodb
PING mongodb (172.18.0.3) 56(84) bytes of data.
64 bytes from mongodb.app_nw (172.18.0.3): icmp_seq=1 ttl=64 time=0.056 ms
64 bytes from mongodb.app_nw (172.18.0.3): icmp_seq=2 ttl=64 time=0.064 ms
64 bytes from mongodb.app_nw (172.18.0.3): icmp_seq=3 ttl=64 time=0.087 ms
64 bytes from mongodb.app_nw (172.18.0.3): icmp_seq=4 ttl=64 time=0.086 ms
64 bytes from mongodb.app_nw (172.18.0.3): icmp_seq=5 ttl=64 time=0.060 ms
^C
--- mongodb ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3998ms
rtt min/avg/max/mdev = 0.056/0.070/0.087/0.016 ms
01ae0a7: we can start to phase out puppet, and provision the docker containers with dockerfile syntax. We've verified that we were able to provision users in the `docker` unit test environment:
vagrant@trusty64:/vagrant/test$ ./unit-tests
...
Step 7/7 : ENTRYPOINT python app.py
---> Running in 4af7bae49a07
---> c62331dcbb1f
Removing intermediate container 4af7bae49a07
Successfully built c62331dcbb1f
7d0c5e17b58c6b1df24600d02274f3d13b52f684a3f14b6b7e3f65cfd13d365c
2f72b86e08ec3568f3ba2099d3096758d0e759bf9e7c80714aaa43f0424b2e36
2019e2afc6be804a67899284dd53372d0d1793ff9f989e5d098024041c68621a
6624afb68403813398ef8dab2e144d32b06f6e2bd6a02249c179222567a48b6b
032b70f48a1303710f8794446fc47a7f12f2362346f9cc74e6d24f16696f8d4a
765275b3221a3d64f395f719d3347f3d30696a51f06c370d5583f7c651c1c64d
Successfully added user: {
"user" : "jeff1evesque",
"roles" : [
{
"role" : "clusterAdmin",
"db" : "admin"
},
{
"role" : "readWriteAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
}
]
}
============================= test session starts ==============================
platform linux2 -- Python 2.7.6, pytest-3.1.2, py-1.4.34, pluggy-0.4.0
rootdir: /var/machine-learning/test/live_server, inifile: pytest.ini
plugins: flask-0.10.0, cov-2.4.0
collected 28 items
...
Note: we should attempt to figure out how to make our corresponding dockerfile(s) more dynamic, by referencing our yaml configurations.
Our above recent commits were able to provision the `jeff1evesque` user within the mongodb database:
vagrant@trusty64:/vagrant/test$ ./unit-tests
...
Successfully built abcb8c09382e
e5671ea89b4f74f4b067bdf5f4fd1a15e317ed6c513154af6f4ee986f0b864d4
e91df9d4beb9362690d0bd9294e036726fbbd69c8d8a027e88e1c9a240c71c49
b98ba76c9da47b25c1d20d086c5e605758e3a13af0ebd473330e159978e77e6e
1f3197865885055d3d41518df37de2707441861d6ea75e7ddeb99f8a9e420b54
46f98dbfb3dc4549a486ccb9123b1377ef588e07d167743f80a2b7457a46112f
75bfc45c0d946c44bcea873aef88cc7253f6d97c50790244257583055536ef66
Successfully added user: {
"user" : "jeff1evesque",
"roles" : [
{
"role" : "clusterAdmin",
"db" : "admin"
},
{
"role" : "readWriteAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
}
]
}
============================= test session starts ==============================
...
vagrant@trusty64:/vagrant/test$ sudo docker exec -it mongodb sudo netstat -ntlup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.11:39459 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:27017 0.0.0.0:* LISTEN 1/mongod
udp 0 0 127.0.0.11:58971 0.0.0.0:* -
vagrant@trusty64:/vagrant/test$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
18340bef3d53 container-webserver "python app.py test" 15 minutes ago Exited (0) 13 minutes ago webserver-pytest
f701e6571918 container-mariadb "/bin/sh -c mysqld" 15 minutes ago Up 15 minutes mariadb
1c4f94c7e34a container-webserver "python app.py run" 15 minutes ago Up 15 minutes webserver
05f710501f10 container-mongodb "/usr/bin/mongod -..." 15 minutes ago Up 15 minutes mongodb
11926a267118 container-redis "/bin/sh -c redis-..." 15 minutes ago Up 15 minutes redis
0c368b4405b0 container-default "/bin/bash" 15 minutes ago Exited (0) 15 minutes ago base
vagrant@trusty64:/vagrant/test$ sudo docker exec -it mongodb mongo --eval 'db.getUsers()'
MongoDB shell version: 3.2.14
connecting to: test
[
{
"_id" : "test.jeff1evesque",
"user" : "jeff1evesque",
"db" : "test",
"roles" : [
{
"role" : "clusterAdmin",
"db" : "admin"
},
{
"role" : "readWriteAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
}
]
}
]
231cfb4: we'll need to restart the `mongod` process after we `createUser`, for both the `vagrant` build, as well as the `docker` unit test build.
We temporarily amended (not committed) our `unit-tests` with the following:
...
## provision mongodb authorization
sudo docker exec -it mongodb sudo mongo mongodb://mongodb:27017 --eval "db.getSiblingDB('admin'); db.createUser({\
user: 'jeff1evesque',\
pwd: 'password',\
roles: [\
{role: 'clusterAdmin', db: 'admin' },\
{role: 'readWriteAnyDatabase', db: 'admin' },\
{role: 'userAdminAnyDatabase', db: 'admin' },\
{role: 'dbAdminAnyDatabase', db: 'admin' }\
]},\
{ w: 'majority' , wtimeout: 5000 } )" --quiet
sudo docker exec -it mongodb sudo cat /etc/mongod.conf
sudo docker exec -it mongodb sudo ps -eo pid,cmd,lstart
echo '================================================='
sudo docker exec -it mongodb sudo sed -i "/#[[:space:]]*security:/s/^#//g" /etc/mongod.conf
sudo docker exec -it mongodb sudo sed -i "/#[[:space:]]*authorization:[[:space:]]*enabled/s/^#//g" /etc/mongod.conf
echo '================================================='
sudo docker exec -it mongodb sudo cat /etc/mongod.conf
sudo docker exec -it mongodb sudo ps -eo pid,cmd,lstart
sudo docker restart mongodb
echo '================================================='
sudo docker exec -it mongodb sudo cat /etc/mongod.conf
sudo docker exec -it mongodb sudo ps -eo pid,cmd,lstart
...
Upon running `./unit-tests` in our vagrant environment:
vagrant@trusty64:/vagrant/test$ ./unit-tests
...
Successfully added user: {
"user" : "jeff1evesque",
"roles" : [
{
"role" : "clusterAdmin",
"db" : "admin"
},
{
"role" : "readWriteAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
}
]
}
## mongodb.conf, this file is enforced by puppet.
##
## Note: http://docs.mongodb.org/manual/reference/configuration-options/
##
## where and how to store data.
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
## where to write logging data.
systemLog:
destination: file
logAppend: true
path: /var/log/mongodb/mongod.log
## network interfaces
net:
port: 27017
bindIp: 0.0.0.0
## role-based access controls
#security:
# authorization: enabled
PID CMD STARTED
1 /usr/bin/mongod -f /etc/mon Mon Jun 19 00:00:37 2017
34 sudo ps -eo pid,cmd,lstart Mon Jun 19 00:00:41 2017
37 ps -eo pid,cmd,lstart Mon Jun 19 00:00:41 2017
=================================================
=================================================
## mongodb.conf, this file is enforced by puppet.
##
## Note: http://docs.mongodb.org/manual/reference/configuration-options/
##
## where and how to store data.
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
## where to write logging data.
systemLog:
destination: file
logAppend: true
path: /var/log/mongodb/mongod.log
## network interfaces
net:
port: 27017
bindIp: 0.0.0.0
## role-based access controls
security:
authorization: enabled
PID CMD STARTED
1 /usr/bin/mongod -f /etc/mon Mon Jun 19 00:00:37 2017
53 sudo ps -eo pid,cmd,lstart Mon Jun 19 00:00:41 2017
57 ps -eo pid,cmd,lstart Mon Jun 19 00:00:41 2017
mongodb
=================================================
## mongodb.conf, this file is enforced by puppet.
##
## Note: http://docs.mongodb.org/manual/reference/configuration-options/
##
## where and how to store data.
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
## where to write logging data.
systemLog:
destination: file
logAppend: true
path: /var/log/mongodb/mongod.log
## network interfaces
net:
port: 27017
bindIp: 0.0.0.0
## role-based access controls
security:
authorization: enabled
PID CMD STARTED
1 /usr/bin/mongod -f /etc/mon Mon Jun 19 00:00:42 2017
11 sudo ps -eo pid,cmd,lstart Mon Jun 19 00:00:43 2017
15 ps -eo pid,cmd,lstart Mon Jun 19 00:00:43 2017
...
We notice the `mongod` start time, for the corresponding `pid`, changed. So, we'll need to check if our `pymongo` implementation properly authenticates to the `mongod` process.
We were able to connect to our mongodb via the `mongo` shell command:
vagrant@trusty64:/vagrant/test$ sudo docker exec -it mongodb mongo --port 27017 -u authenticated -p password
MongoDB shell version: 3.2.14
connecting to: 127.0.0.1:27017/test
Server has startup warnings:
2017-06-20T08:20:10.516-0400 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2017-06-20T08:20:10.516-0400 I CONTROL [initandlisten]
2017-06-20T08:20:10.516-0400 I CONTROL [initandlisten]
2017-06-20T08:20:10.516-0400 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-06-20T08:20:10.517-0400 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-06-20T08:20:10.517-0400 I CONTROL [initandlisten]
2017-06-20T08:20:10.517-0400 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-06-20T08:20:10.517-0400 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-06-20T08:20:10.517-0400 I CONTROL [initandlisten]
However, the corresponding snippet from `query.py`:
```python
# single mongodb instance
self.client = MongoClient(
    "mongodb://{user}:{pass}@{host}/admin".format(**self.args)
)
self.database = self.client[self.args['db']]
self.collection = self.database[collection]
```
generates the following error, contained within `/var/log/mongodb/mongod.log`:
2017-06-20T08:22:01.782-0400 I ACCESS [conn8] SCRAM-SHA-1 authentication failed for authenticated on admin from client 172.18.0.6 ; UserNotFound: Could not find user authenticated@admin
We created a corresponding question on stackoverflow, and will proceed next by ensuring the `/var/run/mongod.pid` file is defined in the `/etc/mongod.conf` configuration.
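The `UserNotFound: Could not find user authenticated@admin` error above is also consistent with the user having been created in a different database than the `/admin` path (i.e. authSource) in our connection uri; the earlier shell dumps show `"_id" : "test.authenticated"`. A hedged sketch of that alternative fix, with placeholder credentials, would point the authSource at whichever database actually contains the user:

```python
# hypothetical sketch: placeholder credentials, and 'dataset' standing in for
# whichever database the user was actually created in
args = {'user': 'authenticated', 'pw': 'password', 'host': 'localhost', 'db': 'dataset'}
uri = 'mongodb://{user}:{pw}@{host}/{db}?authSource={db}'.format(**args)
# self.client = MongoClient(uri)  # then select database/collection as before
```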
Our manual unit tests now have a little more success:
root@trusty64:/vagrant/test# ./unit-tests
...
============================= test session starts ==============================
platform linux2 -- Python 2.7.6, pytest-3.1.2, py-1.4.34, pluggy-0.4.0
rootdir: /var/machine-learning/test/live_server, inifile: pytest.ini
plugins: flask-0.10.0, cov-2.4.0
collected 28 items
test/live_server/authentication/pytest_account_registration.py .
test/live_server/authentication/pytest_crypto.py .
test/live_server/authentication/pytest_user_login.py .
test/live_server/authentication/pytest_user_logout.py .
test/live_server/authentication/pytest_validate_password.py .
test/live_server/programmatic_interface/dataset_url/pytest_svm_dataset_url.py ..FF
test/live_server/programmatic_interface/dataset_url/pytest_svr_dataset_url.py ..FF
test/live_server/programmatic_interface/file_upload/pytest_svm_file_upload.py ..FF
test/live_server/programmatic_interface/file_upload/pytest_svr_file_upload.py FFFF
test/live_server/programmatic_interface/results/pytest_1_svm_prediction.py ...
test/live_server/programmatic_interface/results/pytest_2_svr_prediction.py ...
test/live_server/programmatic_interface/results/pytest_3_all_prediction_titles.py .
...
self = SocketInfo(<socket._socketobject object at 0x7f52d74a68a0>) CLOSED at 139993749029008
error = InvalidDocument("key '31.111' must not contain '.'",)
def _raise_connection_failure(self, error):
# Catch *all* exceptions from socket methods and close the socket. In
# regular Python, socket operations only raise socket.error, even if
# the underlying cause was a Ctrl-C: a signal raised during socket.recv
# is expressed as an EINTR error from poll. See internal_select_ex() in
# socketmodule.c. All error codes from poll become socket.error at
# first. Eventually in PyEval_EvalFrameEx the interpreter checks for
# signals and throws KeyboardInterrupt into the current frame on the
# main thread.
#
# But in Gevent and Eventlet, the polling mechanism (epoll, kqueue,
# ...) is called in Python code, which experiences the signal as a
# KeyboardInterrupt from the start, rather than as an initial
# socket.error, so we catch that, close the socket, and reraise it.
self.close()
if isinstance(error, socket.error):
_raise_connection_failure(self.address, error)
else:
> raise error
E InvalidDocument: key '31.111' must not contain '.'
/usr/local/lib/python2.7/dist-packages/pymongo/pool.py:552: InvalidDocument
However, the above traceback indicates that we'll need to either restructure our dataset(s), or create a mechanism allowing massaged data to be stored. Additionally, all cases of the `model_generate`, as well as the `model_predict` sessions, will need to be reworked.
Additionally, we have verified that our insert equivalent commands are storing data:
vagrant@trusty64:/vagrant/test$ sudo docker exec -it mongodb mongo admin --port 27017 -u authenticated -p password
MongoDB shell version: 3.2.14
connecting to: 127.0.0.1:27017/admin
> use dataset
switched to db dataset
> show collections
supervised.posts
> var collections = db.getCollectionNames();
> for (var i = 0; i< collections.length; i++) { print('Collection: ' + collections[i]); db.getCollection(collections[i]).find().forEach(printjson); }
Collection: supervised.posts
{
"_id" : ObjectId("595044e60a50bc00010645f6"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm-1.json"
],
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svm",
"session_name" : "sample_svm_title",
"dataset_type" : "dataset_url",
"session_type" : "data_new"
}
},
"error" : null
}
{
"_id" : ObjectId("595044e80a50bc00010645f8"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svm-1.json"
],
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svm",
"dataset_type" : "dataset_url",
"session_id" : "1",
"session_type" : "data_append"
}
},
"error" : null
}
{
"_id" : ObjectId("595044f20a50bc00010645fa"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svr.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svr-1.json"
],
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svr",
"session_name" : "sample_svr_title",
"dataset_type" : "dataset_url",
"session_type" : "data_new"
}
},
"error" : null
}
{
"_id" : ObjectId("595044f40a50bc00010645fc"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : [
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svr.json",
"https://raw.githubusercontent.com/jeff1evesque/machine-learning/master/interface/static/data/json/web_interface/svr-1.json"
],
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svr",
"dataset_type" : "dataset_url",
"session_id" : "2",
"session_type" : "data_append"
}
},
"error" : null
}
{
"_id" : ObjectId("595044fd0a50bc00010645fe"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : {
"dep-variable-5" : [
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 27,
"indep-variable-4" : 295,
"indep-variable-5" : 55.83,
"indep-variable-2" : 95.03,
"indep-variable-3" : 0.488,
"indep-variable-1" : 23.27
},
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 27,
"indep-variable-4" : 295,
"indep-variable-5" : 55.83,
"indep-variable-2" : 95.03,
"indep-variable-3" : 0.488,
"indep-variable-1" : 23.27
},
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 29,
"indep-variable-4" : 303,
"indep-variable-5" : 58.88,
"indep-variable-2" : 97.78,
"indep-variable-3" : 0.638,
"indep-variable-1" : 19.99
}
],
"dep-variable-4" : [
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 32,
"indep-variable-4" : 342,
"indep-variable-5" : 66.67,
"indep-variable-2" : 95.96,
"indep-variable-3" : 0.743,
"indep-variable-1" : 22.1
},
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 30,
"indep-variable-4" : 342,
"indep-variable-5" : 75.67,
"indep-variable-2" : 99.33,
"indep-variable-3" : 0.648,
"indep-variable-1" : 20.71
}
],
"dep-variable-1" : [
{
"indep-variable-6" : 0.002,
"indep-variable-7" : 23,
"indep-variable-4" : 325,
"indep-variable-5" : 54.64,
"indep-variable-2" : 98.01,
"indep-variable-3" : 0.432,
"indep-variable-1" : 23.45
}
],
"dep-variable-3" : {
"indep-variable-6" : 0.002,
"indep-variable-7" : 26,
"indep-variable-4" : 427,
"indep-variable-5" : 75.45,
"indep-variable-2" : 101.21,
"indep-variable-3" : 0.832,
"indep-variable-1" : 22.67
}
},
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svm",
"session_name" : "sample_svm_title",
"dataset_type" : "file_upload",
"session_type" : "data_new"
}
},
"error" : null
}
{
"_id" : ObjectId("595044ff0a50bc0001064600"),
"data" : {
"dataset" : {
"file_upload" : null,
"json_string" : {
"dep-variable-1" : [
{
"indep-variable-6" : 0.002,
"indep-variable-7" : 25,
"indep-variable-4" : 325,
"indep-variable-5" : 54.64,
"indep-variable-2" : 98.01,
"indep-variable-3" : 0.432,
"indep-variable-1" : 23.45
}
],
"dep-variable-3" : [
{
"indep-variable-6" : 0.002,
"indep-variable-7" : 24,
"indep-variable-4" : 427,
"indep-variable-5" : 75.45,
"indep-variable-2" : 101.21,
"indep-variable-3" : 0.832,
"indep-variable-1" : 22.67
}
],
"dep-variable-2" : [
{
"indep-variable-6" : 0.001,
"indep-variable-7" : 31,
"indep-variable-4" : 235,
"indep-variable-5" : 64.45,
"indep-variable-2" : 92.22,
"indep-variable-3" : 0.356,
"indep-variable-1" : 24.32
}
]
},
"upload_quantity" : 1
},
"settings" : {
"model_type" : "svm",
"dataset_type" : "file_upload",
"session_id" : "3",
"session_type" : "data_append"
}
},
"error" : null
}
We need to restructure all our sample dataset(s), by ensuring no keys contain actual values, as was the case with the above 31.111, on an svr data_new session. We'll need to make these adjustments for both classification and regression based calculations, and make the necessary adjustments to our documentation.
e179784: it is likely that the application will need to perform many database writes. So, it seems more logical to maintain a single connection for this purpose, instead of continuously opening and closing connections, which would be expensive on system resources.
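A minimal sketch of that single-connection idea, assuming the client is created once at module level and reused for every write (in production the factory would construct a pymongo MongoClient, which pools connections internally; `get_client` and the module layout here are hypothetical):

```python
# connection.py: hypothetical module holding one shared client,
# created lazily on first use and reused for every subsequent write.
_client = None

def get_client(factory):
    '''
    Return the shared database client, creating it on first call.

    @factory, zero-argument callable which builds the client; in
        production this would be, e.g., lambda: MongoClient(host, port).
    '''
    global _client
    if _client is None:
        _client = factory()
    return _client
```

Every save method would then call `get_client(...)` instead of constructing its own connection.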
Additionally, we'll need to reconsider the need for the data_append session. Since we are refactoring with nosql, we will likely enforce a particular key value in the json file, which binds all json files to be collectively used to generate a corresponding model. This seems like a better solution than relying on a single json file, which can be appended to an unlimited number of times. Specifically, mongodb by default has about a 16MB size limit per document. This means we'll likely take away the data_append session, since it will not be compatible with our nosql implementation, especially if it is later distributed.
We should readjust our flask variable implementation with flask's built-in database connection management:
Our earlier comment is not fully accurate. Specifically, each particular study will contain its own mongodb collection, within the mongodb database. This means if some sensor1 is responsible for collecting data on a determined interval, to be used for defined computation(s), then each successive time the device streams data, it will store the corresponding json documents in the same database collection. So, the mongodb collection(s) will partition each study, by grouping corresponding documents into collections. Additionally, all collections will be contained within the same overall database. This will allow one study to leverage json documents from another study, if permissions have been properly granted.
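Under that layout, regenerating a model for a study would mean reading every json document in the study's collection, and concatenating their observations. A minimal pure-python sketch of that merge (the document shape mirrors the logged documents elsewhere in this thread; the function name is an assumption):

```python
def merge_study_documents(documents):
    '''
    Combine the 'dataset' arrays from every json document stored in a
    study's mongodb collection into one list of observations.

    @documents, list of dicts, each shaped like
        {'properties': {...}, 'dataset': [{...observation...}, ...]}
    '''
    merged = []
    for document in documents:
        merged.extend(document.get('dataset', []))
    return merged
```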
ce76476: we need to define a corresponding form element, to capture the collection
information. This means, we'll need to adjust our corresponding jsx presentation.
We need to rework, and possibly reduce the following views.py
logic, for the web-interface:
...
# web-interface: get submitted form data
if request.form:
settings = request.form
sender = Settings(settings, files)
data_formatted = sender.restructure()
# send reformatted data to brain
loader = Load_Data(data_formatted)
...
Specifically, the argument supplied to the Load_Data class should have the same structure for both interfaces. This means we'll need to take note of the structure supplied by the programmatic-interface, and make the corresponding adjustments for the web-interface equivalent.
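A sketch of what that reshaping could look like, assuming the target structure is the properties/dataset document seen in the logs later in this thread (the function name and the exact key set are assumptions; the real contract still needs to be confirmed against the programmatic path):

```python
def restructure_web_payload(settings, dataset):
    '''
    Reshape the web-interface form submission into the same structure
    the programmatic-interface supplies to Load_Data: a 'properties'
    dict of session settings, plus the converted 'dataset' list.
    '''
    return {
        'properties': {
            'session_type': settings.get('session_type'),
            'model_type': settings.get('model_type'),
            'collection': settings.get('collection'),
        },
        'dataset': dataset,
    }
```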
85e6d2b: we were able to verify that our list comprehension is implemented as expected:
vagrant@trusty64:~$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> myDict = {'b': 3, 'a': 5, 'c': 1}
>>> [v for k, v in sorted(myDict.items())]
[5, 3, 1]
>>> [k for k, v in sorted(myDict.items())]
['a', 'b', 'c']
We need to determine how to properly adjust our label encoder from sv.py:
# generate svm model
if model == list_model_type[0]:
# convert observation labels to a unique integer representation
label_encoder = preprocessing.LabelEncoder()
label_encoder.fit(dataset[:, 0])
encoded_labels = label_encoder.transform(observation_labels)
f0d3ed5: we implemented the LabelEncoder, using the following, to ensure unique label fitting:
vagrant@trusty64:~$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
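The same unique-fitting behavior can be emulated without sklearn, which clarifies what LabelEncoder's fit/transform does under the hood (a sketch for reasoning only, not a replacement for the sklearn call):

```python
def fit_labels(labels):
    '''Map each unique label to an integer, in sorted order.'''
    return {label: index for index, label in enumerate(sorted(set(labels)))}

def transform_labels(labels, encoding):
    '''Convert labels to their integer representation.'''
    return [encoding[label] for label in labels]
```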
The following will verify whether the authenticated user can authenticate against the admin database:
mongo admin --port 27017 -u authenticated -p password
5581017: dataset['result'] will contain a <pymongo.cursor.Cursor object at 0x7feaf25b5390> object, within our sv.py. So, we'll need to consider using mongo's aggregation implementation. Later, if distributed clustering is something of interest, we could implement either aggregation pipelines, or consider integrating hadoop with mongodb.
The following sv.py
snippet:
...
# restructure dataset into arrays
observation_labels = []
grouped_features = []
sorted_labels = []
for dataset in datasets['result']:
logger = Logger(__name__, 'error', 'error')
logger.log('sv.py, dataset: ' + repr(dataset))
for observation in dataset['dataset']:
logger.log('sv.py, observation: ' + repr(observation))
observation_labels.append(observation['dependent-variable'])
indep_variables = observation['independent-variables']
logger.log('sv.py, indep_variables: ' + repr(indep_variables))
for features in indep_variables:
sorted_features = [v for k, v in sorted(features.items())]
grouped_features.append(sorted_features)
logger.log('sv.py, grouped_features: ' + repr(grouped_features))
if not sorted_labels:
sorted_labels = [k for k, v in sorted(features.items())]
logger.log('sv.py, sorted_labels: ' + repr(sorted_labels))
# generate svm model
...
generates an error.log
, for the web-interface:
[2017-07-19 08:16:39,037] {/vagrant/log/logger.py:165} DEBUG - brain.database.dataset: brain/database/dataset.py, collection: u'test-756'
[2017-07-19 08:16:39,037] {/vagrant/log/logger.py:165} DEBUG - brain.database.dataset: brain/database/dataset.py, operation: 'aggregate'
[2017-07-19 08:16:39,037] {/vagrant/log/logger.py:165} DEBUG - brain.database.dataset: brain/database/dataset.py, payload: [{'$project': {'dataset': 1}}]
[2017-07-19 08:16:39,152] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, dataset: {u'_id': ObjectId('596f48df9bd56c083a84bec0'), u'dataset': [{u'dependent-variable': u'dep-variable-1', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 25, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]}, {u'dependent-variable': u'dep-variable-2', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 31, u'indep-variable-4': 235, u'indep-variable-5': 64.45, u'indep-variable-2': 92.22, u'indep-variable-3': 0.356, u'indep-variable-1': 24.32}]}, {u'dependent-variable': u'dep-variable-3', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 24, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]}]}
[2017-07-19 08:16:39,152] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, observation: {u'dependent-variable': u'dep-variable-1', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 25, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]}
[2017-07-19 08:16:39,152] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, indep_variables: [{u'indep-variable-6': 0.002, u'indep-variable-7': 25, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]
[2017-07-19 08:16:39,153] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, grouped_features: [[23.45, 98.01, 0.432, 325, 54.64, 0.002, 25]]
[2017-07-19 08:16:39,153] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, sorted_labels: [u'indep-variable-1', u'indep-variable-2', u'indep-variable-3', u'indep-variable-4', u'indep-variable-5', u'indep-variable-6', u'indep-variable-7']
[2017-07-19 08:16:39,153] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, observation: {u'dependent-variable': u'dep-variable-2', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 31, u'indep-variable-4': 235, u'indep-variable-5': 64.45, u'indep-variable-2': 92.22, u'indep-variable-3': 0.356, u'indep-variable-1': 24.32}]}
[2017-07-19 08:16:39,154] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, indep_variables: [{u'indep-variable-6': 0.001, u'indep-variable-7': 31, u'indep-variable-4': 235, u'indep-variable-5': 64.45, u'indep-variable-2': 92.22, u'indep-variable-3': 0.356, u'indep-variable-1': 24.32}]
[2017-07-19 08:16:39,154] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, grouped_features: [[23.45, 98.01, 0.432, 325, 54.64, 0.002, 25], [24.32, 92.22, 0.356, 235, 64.45, 0.001, 31]]
[2017-07-19 08:16:39,154] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, observation: {u'dependent-variable': u'dep-variable-3', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 24, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]}
[2017-07-19 08:16:39,155] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, indep_variables: [{u'indep-variable-6': 0.002, u'indep-variable-7': 24, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]
[2017-07-19 08:16:39,156] {/vagrant/log/logger.py:165} DEBUG - brain.session.model.sv: sv.py, grouped_features: [[23.45, 98.01, 0.432, 325, 54.64, 0.002, 25], [24.32, 92.22, 0.356, 235, 64.45, 0.001, 31], [22.67, 101.21, 0.832, 427, 75.45, 0.002, 24]]
[2017-07-19 08:16:39,164] {/vagrant/log/logger.py:165} DEBUG - brain.load_data: load_data.py, response: {'status': 0, 'msg': 'Model properly generated', 'type': 'model-generate'}
Note: the above snippet implemented the svm model-type, during the data_new session.
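The restructuring logic from the sv.py snippet above can be reduced to a testable function (Logger calls removed, sorted_labels initialized locally; the function name is ours, not the project's):

```python
def flatten_datasets(datasets):
    '''
    Restructure aggregated mongodb documents into parallel arrays:
    observation labels, grouped feature rows, and sorted feature names.
    '''
    observation_labels = []
    grouped_features = []
    sorted_labels = []
    for dataset in datasets:
        for observation in dataset['dataset']:
            observation_labels.append(observation['dependent-variable'])
            for features in observation['independent-variables']:
                # sort by key, so every row has the same column order
                grouped_features.append(
                    [v for k, v in sorted(features.items())]
                )
                if not sorted_labels:
                    sorted_labels = [k for k, v in sorted(features.items())]
    return observation_labels, grouped_features, sorted_labels
```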
The following csv2dict.py
snippet:
...
logger = Logger(__name__, 'error', 'error')
# open temporary 'csvfile' reader object
dataset_reader = csv.reader(
raw_data,
delimiter=' ',
quotechar='|'
)
# first row of csvfile: get all columns, except first
for row in islice(dataset_reader, 0, 1):
indep_labels_list = row[0].split(',')[1:]
# all rows of csvfile: except first row
for dep_index, row in enumerate(islice(dataset_reader, 0, None)):
row_arr = row[0].split(',')
features_list = row_arr[1:]
features_dict = {k: v for k, v in zip(indep_labels_list, features_list)}
observation = {
'dependent-variable': row_arr[:1][0],
'independent-variables': [features_dict]
}
dataset.append(observation)
logger.log('/brain/converter/svm/csvtodict.py, dataset: ' + repr(dataset))
...
generates an error.log
, for the web-interface:
[2017-07-21 08:02:06,168] {/vagrant/log/logger.py:165} DEBUG - brain.converter.svm.csv2dict: /brain/converter/svm/csvtodict.py, dataset: [{'dependent-variable': 'dep-variable-1', 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '23', 'indep-variable-4': '325', 'indep-variable-5': '54.64', 'indep-variable-2': '98.01', 'indep-variable-3': '0.432', 'indep-variable-1': '23.45'}]}, {'dependent-variable': 'dep-variable-4', 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '32', 'indep-variable-4': '342', 'indep-variable-5': '66.67', 'indep-variable-2': '95.96', 'indep-variable-3': '0.743', 'indep-variable-1': '22.1'}]}, {'dependent-variable': 'dep-variable-5', 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 'indep-variable-1': '23.27'}]}, {'dependent-variable': 'dep-variable-3', 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '26', 'indep-variable-4': '427', 'indep-variable-5': '75.45', 'indep-variable-2': '101.21', 'indep-variable-3': '0.832', 'indep-variable-1': '22.67'}]}, {'dependent-variable': 'dep-variable-5', 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '29', 'indep-variable-4': '303', 'indep-variable-5': '58.88', 'indep-variable-2': '97.78', 'indep-variable-3': '0.638', 'indep-variable-1': '19.99'}]}, {'dependent-variable': 'dep-variable-5', 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 'indep-variable-1': '23.27'}]}, {'dependent-variable': 'dep-variable-4', 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '30', 'indep-variable-4': '342', 'indep-variable-5': '75.67', 'indep-variable-2': '99.33', 'indep-variable-3': '0.648', 
'indep-variable-1': '20.71'}]}]
Note: the above snippet implemented the svm model-type, during the data_new session.
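The csv conversion above can be exercised end-to-end with an in-memory file (Python 3 syntax here, while the production code runs under 2.7; `next(reader)` stands in for the original islice over the header row):

```python
import csv
from io import StringIO

def csv_to_dataset(raw_data):
    '''Convert a csv file object into the list-of-observations structure.'''
    dataset = []
    reader = csv.reader(raw_data, delimiter=' ', quotechar='|')
    # first row: all column labels, except the dependent-variable column
    indep_labels = next(reader)[0].split(',')[1:]
    # remaining rows: one observation each
    for row in reader:
        row_arr = row[0].split(',')
        features = dict(zip(indep_labels, row_arr[1:]))
        dataset.append({
            'dependent-variable': row_arr[0],
            'independent-variables': [features],
        })
    return dataset
```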
The following xml2dict.py
snippet:
# open temporary 'xmltodict' object
dataset = []
dataset_reader = xmltodict.parse(raw_data)
logger = Logger(__name__, 'error', 'error')
# build dataset
for observation in dataset_reader['dataset']['observation']:
features_dict = {}
dependent_variable = observation['dependent-variable']
for feature in observation['independent-variable']:
features_dict[feature['label']] = feature['value']
adjusted = {
'dependent-variable': dependent_variable,
'independent-variables': [features_dict]
}
dataset.append(adjusted)
logger.log('/brain/converter/format/xml2dict.py, dataset: ' + repr(dataset))
generates an error.log
, for the web-interface:
2017-07-22 15:58:34,624] {/vagrant/log/logger.py:165} DEBUG - brain.session.base_data: /brain/session/base_data.py, self.dataset: {'properties': {'stream': False, 'session_type': u'data_new', 'collection': u'collection-358', 'dataset_type': u'file_upload', 'model_type': u'svm', 'session_name': u'test'}, 'dataset': [{'dependent-variable': u'dep-variable-1', 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'23', u'indep-variable-4': u'325', u'indep-variable-5': u'56.64', u'indep-variable-2': u'98.01', u'indep-variable-3': u'0.432', u'indep-variable-1': u'23.45'}]}, {'dependent-variable': u'dep-variable-4', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'32', u'indep-variable-4': u'342', u'indep-variable-5': u'66.67', u'indep-variable-2': u'95.96', u'indep-variable-3': u'0.743', u'indep-variable-1': u'22.1'}]}, {'dependent-variable': u'dep-variable-5', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-3', 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'26', u'indep-variable-4': u'427', u'indep-variable-5': u'75.45', u'indep-variable-2': u'101.21', u'indep-variable-3': u'0.832', u'indep-variable-1': u'22.67'}]}, {'dependent-variable': u'dep-variable-5', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'29', u'indep-variable-4': u'303', u'indep-variable-5': u'58.88', u'indep-variable-2': u'97.78', u'indep-variable-3': u'0.638', u'indep-variable-1': u'19.99'}]}, {'dependent-variable': u'dep-variable-1', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', 
u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-1', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'30', u'indep-variable-4': u'342', u'indep-variable-5': u'75.67', u'indep-variable-2': u'99.33', u'indep-variable-3': u'0.648', u'indep-variable-1': u'20.71'}]}, {'dependent-variable': u'dep-variable-1', 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'25', u'indep-variable-4': u'325', u'indep-variable-5': u'54.64', u'indep-variable-2': u'98.01', u'indep-variable-3': u'0.432', u'indep-variable-1': u'23.45'}]}, {'dependent-variable': u'dep-variable-2', 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'31', u'indep-variable-4': u'235', u'indep-variable-5': u'64.45', u'indep-variable-2': u'92.22', u'indep-variable-3': u'0.356', u'indep-variable-1': u'24.32'}]}, {'dependent-variable': u'dep-variable-3', 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'24', u'indep-variable-4': u'427', u'indep-variable-5': u'75.45', u'indep-variable-2': u'101.21', u'indep-variable-3': u'0.832', u'indep-variable-1': u'22.67'}]}]}
Note: the above snippet implemented the svm model-type, during the data_new session.
Note: the above error.log snippet was considerably longer, since both the svm.xml, and svm-1.xml files were used during the data_new session.
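The same conversion can be sketched with the stdlib ElementTree instead of xmltodict (the dataset/observation/independent-variable element names follow the snippet above; the exact xml layout is an assumption, and this is not the production parser):

```python
import xml.etree.ElementTree as ET

def xml_to_dataset(raw_data):
    '''Convert an xml string into the list-of-observations structure.'''
    dataset = []
    root = ET.fromstring(raw_data)
    for observation in root.findall('observation'):
        features = {}
        for variable in observation.findall('independent-variable'):
            features[variable.find('label').text] = variable.find('value').text
        dataset.append({
            'dependent-variable': observation.find('dependent-variable').text,
            'independent-variables': [features],
        })
    return dataset
```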
0c2d225: our previous commit 4cfa9f0 (from another computer) had wiped out the last couple days of commits from history, by forcing a merge from master into feature-2844. So, on the original machine (used this weekend), we were able to recover history by simply pushing the most recent commit (i.e. 98d98d8) prior to the accidental merge, via git commit --amend -m "#2844: ...".
We have temporarily added the following snippet in our base_data.py:
...
def save_premodel_dataset(self):
'''
This method saves the entire dataset collection, as a json
document, into the nosql implementation.
'''
# save dataset
collection = self.premodel_data['properties']['collection']
collection_adjusted = collection.lower().replace(' ', '_')
cursor = Collection()
document = {'properties': self.premodel_data['properties'], 'dataset': self.dataset}
logger = Logger(__name__, 'error', 'error')
logger.log('/brain/session/base_data.py, self.dataset: ' + repr(document))
...
Upon a fresh data_new
session, with the following input datasets:
We noticed the following within our error.log
:
2017-07-24 18:21:30,397] {/vagrant/log/logger.py:165} DEBUG - brain.session.base_data: /brain/session/base_data.py, self.dataset: {'properties': {'stream': False, 'session_type': u'data_new', 'collection': u'collection-621', 'dataset_type': u'file_upload', 'model_type': u'svm', 'session_name': u'test'}, 'dataset': [{'dependent-variable': 'dep-variable-1', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '23', 'indep-variable-4': '325', 'indep-variable-5': '54.64', 'indep-variable-2': '98.01', 'indep-variable-3': '0.432', 'indep-variable-1': '23.45'}]}, {'dependent-variable': 'dep-variable-4', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '32', 'indep-variable-4': '342', 'indep-variable-5': '66.67', 'indep-variable-2': '95.96', 'indep-variable-3': '0.743', 'indep-variable-1': '22.1'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 'indep-variable-1': '23.27'}]}, {'dependent-variable': 'dep-variable-3', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '26', 'indep-variable-4': '427', 'indep-variable-5': '75.45', 'indep-variable-2': '101.21', 'indep-variable-3': '0.832', 'indep-variable-1': '22.67'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '29', 'indep-variable-4': '303', 'indep-variable-5': '58.88', 'indep-variable-2': '97.78', 'indep-variable-3': '0.638', 'indep-variable-1': '19.99'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 
'indep-variable-1': '23.27'}]}, {'dependent-variable': 'dep-variable-4', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '30', 'indep-variable-4': '342', 'indep-variable-5': '75.67', 'indep-variable-2': '99.33', 'indep-variable-3': '0.648', 'indep-variable-1': '20.71'}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'23', u'indep-variable-4': u'325', u'indep-variable-5': u'56.64', u'indep-variable-2': u'98.01', u'indep-variable-3': u'0.432', u'indep-variable-1': u'23.45'}]}, {'dependent-variable': u'dep-variable-4', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'32', u'indep-variable-4': u'342', u'indep-variable-5': u'66.67', u'indep-variable-2': u'95.96', u'indep-variable-3': u'0.743', u'indep-variable-1': u'22.1'}]}, {'dependent-variable': u'dep-variable-5', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-3', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'26', u'indep-variable-4': u'427', u'indep-variable-5': u'75.45', u'indep-variable-2': u'101.21', u'indep-variable-3': u'0.832', u'indep-variable-1': u'22.67'}]}, {'dependent-variable': u'dep-variable-5', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'29', u'indep-variable-4': u'303', u'indep-variable-5': u'58.88', u'indep-variable-2': u'97.78', u'indep-variable-3': u'0.638', u'indep-variable-1': u'19.99'}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', 
u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'30', u'indep-variable-4': u'342', u'indep-variable-5': u'75.67', u'indep-variable-2': u'99.33', u'indep-variable-3': u'0.648', u'indep-variable-1': u'20.71'}]}, {u'dependent-variable': u'dep-variable-1', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 25, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]}, {u'dependent-variable': u'dep-variable-2', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 31, u'indep-variable-4': 235, u'indep-variable-5': 64.45, u'indep-variable-2': 92.22, u'indep-variable-3': 0.356, u'indep-variable-1': 24.32}]}, {u'dependent-variable': u'dep-variable-3', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 24, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]}, {u'dependent-variable': u'dep-variable-1', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 23, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]}, {u'dependent-variable': u'dep-variable-4', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 32, u'indep-variable-4': 342, u'indep-variable-5': 66.67, u'indep-variable-2': 95.96, u'indep-variable-3': 0.743, u'indep-variable-1': 22.1}, {u'indep-variable-6': 0.001, u'indep-variable-7': 30, u'indep-variable-4': 342, u'indep-variable-5': 75.67, u'indep-variable-2': 99.33, u'indep-variable-3': 0.648, u'indep-variable-1': 20.71}]}, {u'dependent-variable': u'dep-variable-5', 
u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 27, u'indep-variable-4': 295, u'indep-variable-5': 55.83, u'indep-variable-2': 95.03, u'indep-variable-3': 0.488, u'indep-variable-1': 23.27}, {u'indep-variable-6': 0.001, u'indep-variable-7': 27, u'indep-variable-4': 295, u'indep-variable-5': 55.83, u'indep-variable-2': 95.03, u'indep-variable-3': 0.488, u'indep-variable-1': 23.27}, {u'indep-variable-6': 0.001, u'indep-variable-7': 29, u'indep-variable-4': 303, u'indep-variable-5': 58.88, u'indep-variable-2': 97.78, u'indep-variable-3': 0.638, u'indep-variable-1': 19.99}]}, {u'dependent-variable': u'dep-variable-3', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 26, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]}]}
Using the same temporary snippet in our base_data.py
, we were able to run a fresh data_append
session, with multiple input datasets:
We noticed the following within our error.log
:
[2017-07-24 18:41:16,875] {/vagrant/log/logger.py:165} DEBUG - brain.session.base_data: /brain/session/base_data.py, self.dataset: {'properties': {'model_type': u'svm', 'dataset_type': u'file_upload', 'collection': u'collection-621', 'stream': False, 'session_type': u'data_append'}, 'dataset': [{'dependent-variable': 'dep-variable-1', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '25', 'indep-variable-4': '325', 'indep-variable-5': '54.64', 'indep-variable-2': '98.01', 'indep-variable-3': '0.432', 'indep-variable-1': '23.45'}]}, {'dependent-variable': 'dep-variable-2', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '31', 'indep-variable-4': '235', 'indep-variable-5': '64.45', 'indep-variable-2': '92.22', 'indep-variable-3': '0.356', 'indep-variable-1': '24.32'}]}, {'dependent-variable': 'dep-variable-3', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '24', 'indep-variable-4': '427', 'indep-variable-5': '75.45', 'indep-variable-2': '101.21', 'indep-variable-3': '0.832', 'indep-variable-1': '22.67'}]}, {'dependent-variable': 'dep-variable-1', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '23', 'indep-variable-4': '325', 'indep-variable-5': '54.64', 'indep-variable-2': '98.01', 'indep-variable-3': '0.432', 'indep-variable-1': '23.45'}]}, {'dependent-variable': 'dep-variable-4', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '32', 'indep-variable-4': '342', 'indep-variable-5': '66.67', 'indep-variable-2': '95.96', 'indep-variable-3': '0.743', 'indep-variable-1': '22.1'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 'indep-variable-1': '23.27'}]}, 
{'dependent-variable': 'dep-variable-3', 'error': None, 'independent-variables': [{'indep-variable-6': '0.002', 'indep-variable-7': '26', 'indep-variable-4': '427', 'indep-variable-5': '75.45', 'indep-variable-2': '101.21', 'indep-variable-3': '0.832', 'indep-variable-1': '22.67'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '29', 'indep-variable-4': '303', 'indep-variable-5': '58.88', 'indep-variable-2': '97.78', 'indep-variable-3': '0.638', 'indep-variable-1': '19.99'}]}, {'dependent-variable': 'dep-variable-5', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '27', 'indep-variable-4': '295', 'indep-variable-5': '55.83', 'indep-variable-2': '95.03', 'indep-variable-3': '0.488', 'indep-variable-1': '23.27'}]}, {'dependent-variable': 'dep-variable-4', 'error': None, 'independent-variables': [{'indep-variable-6': '0.001', 'indep-variable-7': '30', 'indep-variable-4': '342', 'indep-variable-5': '75.67', 'indep-variable-2': '99.33', 'indep-variable-3': '0.648', 'indep-variable-1': '20.71'}]}, {u'dependent-variable': u'dep-variable-1', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 23, u'indep-variable-4': 325, u'indep-variable-5': 54.64, u'indep-variable-2': 98.01, u'indep-variable-3': 0.432, u'indep-variable-1': 23.45}]}, {u'dependent-variable': u'dep-variable-4', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 32, u'indep-variable-4': 342, u'indep-variable-5': 66.67, u'indep-variable-2': 95.96, u'indep-variable-3': 0.743, u'indep-variable-1': 22.1}, {u'indep-variable-6': 0.001, u'indep-variable-7': 30, u'indep-variable-4': 342, u'indep-variable-5': 75.67, u'indep-variable-2': 99.33, u'indep-variable-3': 0.648, u'indep-variable-1': 20.71}]}, {u'dependent-variable': u'dep-variable-5', u'independent-variables': [{u'indep-variable-6': 0.001, u'indep-variable-7': 27, 
u'indep-variable-4': 295, u'indep-variable-5': 55.83, u'indep-variable-2': 95.03, u'indep-variable-3': 0.488, u'indep-variable-1': 23.27}, {u'indep-variable-6': 0.001, u'indep-variable-7': 27, u'indep-variable-4': 295, u'indep-variable-5': 55.83, u'indep-variable-2': 95.03, u'indep-variable-3': 0.488, u'indep-variable-1': 23.27}, {u'indep-variable-6': 0.001, u'indep-variable-7': 29, u'indep-variable-4': 303, u'indep-variable-5': 58.88, u'indep-variable-2': 97.78, u'indep-variable-3': 0.638, u'indep-variable-1': 19.99}]}, {u'dependent-variable': u'dep-variable-3', u'independent-variables': [{u'indep-variable-6': 0.002, u'indep-variable-7': 26, u'indep-variable-4': 427, u'indep-variable-5': 75.45, u'indep-variable-2': 101.21, u'indep-variable-3': 0.832, u'indep-variable-1': 22.67}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'23', u'indep-variable-4': u'325', u'indep-variable-5': u'56.64', u'indep-variable-2': u'98.01', u'indep-variable-3': u'0.432', u'indep-variable-1': u'23.45'}]}, {'dependent-variable': u'dep-variable-4', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'32', u'indep-variable-4': u'342', u'indep-variable-5': u'66.67', u'indep-variable-2': u'95.96', u'indep-variable-3': u'0.743', u'indep-variable-1': u'22.1'}]}, {'dependent-variable': u'dep-variable-5', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-3', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.002', u'indep-variable-7': u'26', u'indep-variable-4': u'427', u'indep-variable-5': u'75.45', u'indep-variable-2': u'101.21', u'indep-variable-3': u'0.832', u'indep-variable-1': u'22.67'}]}, 
{'dependent-variable': u'dep-variable-5', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'29', u'indep-variable-4': u'303', u'indep-variable-5': u'58.88', u'indep-variable-2': u'97.78', u'indep-variable-3': u'0.638', u'indep-variable-1': u'19.99'}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'27', u'indep-variable-4': u'295', u'indep-variable-5': u'55.83', u'indep-variable-2': u'95.03', u'indep-variable-3': u'0.488', u'indep-variable-1': u'23.27'}]}, {'dependent-variable': u'dep-variable-1', 'error': None, 'independent-variables': [{u'indep-variable-6': u'0.001', u'indep-variable-7': u'30', u'indep-variable-4': u'342', u'indep-variable-5': u'75.67', u'indep-variable-2': u'99.33', u'indep-variable-3': u'0.648', u'indep-variable-1': u'20.71'}]}]}
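The document assembly in the base_data.py snippet above reduces to a small pure function, which makes the collection-name normalization easy to unit test (the function name is ours; the actual Collection() write is omitted):

```python
def build_premodel_document(properties, dataset):
    '''
    Assemble the json document to be saved into the nosql store, and
    normalize the target collection name (lowercase, spaces to underscores).
    '''
    collection = properties['collection'].lower().replace(' ', '_')
    document = {'properties': properties, 'dataset': dataset}
    return collection, document
```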
After submitting a model_generate session, our model_predict session was able to provide an option for the corresponding model. However, selecting the corresponding model for prediction always resulted in the form not updating to the chosen model, for prediction:
Therefore, we'll need to investigate the following scenarios:
model_generate session
64c4559: in the future, we could ensure the operating user (whether logged-in, or anonymous) does not save a collection under the name of an existing collection associated with their account. This will guarantee uniqueness, with respect to a corresponding model_predict.jsx session:
{/* array components require unique 'key' value */}
{options && options.map(function(value) {
return <option key={value.collection} value={value.collection}>
{value.collection}
</option>;
})}
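The uniqueness constraint described in 64c4559 could be enforced server-side before the insert. A sketch against a plain set of existing names (in production this would consult the user's collections in mongodb; the function name is hypothetical):

```python
def is_collection_available(name, existing_collections):
    '''
    Return True if the (normalized) collection name is not already
    taken by this user; normalization matches the save path.
    '''
    normalized = name.lower().replace(' ', '_')
    return normalized not in existing_collections
```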
After #2842 is resolved, we need to determine the corresponding nosql data structure, and implement it accordingly in our python backend logic.