Remove raw_data_dict reference, add some tests

calpoly-csai / api

Official API for the NIMBUS Voice Assistant accessible via HTTP REST protocol.

https://nimbus.api.calpolycsai.com/

GNU General Public License v3.0

9 stars 4 forks source link

Remove raw_data_dict reference, add some tests #114

Closed Jason-Ku closed 4 years ago

Jason-Ku commented 4 years ago

What's New?

Stray raw_data_dict reference replaced with data_dict. Should prevent the api from crashing if trying to save an AudioSampleMetaData Entity.
Set up some unittests for insert_entity and update_entity
Some formatting to pass pep8

Type of change (pick-one)

[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update

How Has This Been Tested?

python -m pytest

Checklist (check-all-before-merge)

formatting help: - [x] means "checked' and - [ ] means "unchecked"

[x] I documented my code according to the Google Python Style Guide
[ ] I ran ./build_docs.sh and the docs look fine
[ ] I ran ./type_check.sh and got no errors
[ ] I ran ./format.sh because it automatically cleans my code for me 😄
[ ] I ran ./lint.sh to check for what "format" missed
[x] I added my tests to the /tests directory
[ ] I ran ./run_tests.sh and all the tests pass

Jason-Ku commented 4 years ago

Re-ran the tests after merging Michael's PR to install pytests, and the new tests I've written are failing due to something about a config.json. Anyone know what that's about? The error in question:

 with open(config_file) as json_data_file:
        FileNotFoundError: [Errno 2] No such file or directory: 'config.json'

Also the other four tests that were previously there are failing since I believe they cannot connect to the DB. Do we want to have pytest running those tests during GitHub Action's run_tests check? Do we want them at all?

mfekadu commented 4 years ago

related to #115

mfekadu commented 4 years ago

@Jason-Ku

You are right.

https://github.com/calpoly-csai/api/blob/cc1b19457362be06943026e446d3a8738d26804f/database_wrapper.py#L339-L340

The NimbusMySQLAlchemy class expects to initialize SQLAlchemy with the connection information found inside of config.json

The GitHub Action for testing fails because there only exists a config_SAMPLE.json but no config.json for security & privacy reasons.

I am not sure of the best resolution to this.

What are your thoughts on the following options:

mocking out the __init__ of NimbusMySQLAlchemy.
- After all, the tests do not need a live connection to the DB, right?
- Is mocking __init__ even possible?
refactoring our code to not depend on config.json
- After all, the database_wrapper.py file is quite large and only getting larger. Is that a code-smell? Where would we begin to break up the NimbusMySQLAlchemy class? How would refactoring affect the NimbusDatabase abstract class?
updating the GitHub Action for testing to simply run mv config_SAMPLE.json config.json
- the easiest choice, honestly.

mfekadu commented 4 years ago

More on solution # 3

python code


```python
RDBMS = "mysql"
PIP_PACKAGE = "mysqlconnector"  # pip install mysql-connector-python
mysql_config = {
    "user": "username",
    "password": "password",
    "host": "localhost",
    "port": "3306",
    "database": "dev"
}
SQLALCHEMY_DATABASE_URI = "{}+{}://{}:{}@{}:{}/{}".format(
    RDBMS,
    PIP_PACKAGE,
    mysql_config["user"],
    mysql_config["password"],
    mysql_config["host"],
    mysql_config["port"],
    mysql_config["database"]
)
from sqlalchemy import create_engine, inspect
engine = create_engine(SQLALCHEMY_DATABASE_URI)
print(engine)
```

python repl output


```python
>>> RDBMS = "mysql"
>>> PIP_PACKAGE = "mysqlconnector"
>>> mysql_config = {
...     "user": "username",
...     "password": "password",
...     "host": "localhost",
...     "port": "3306",
...     "database": "dev"
... }
>>> SQLALCHEMY_DATABASE_URI = "{}+{}://{}:{}@{}:{}/{}".format(
...     RDBMS,
...     PIP_PACKAGE,
...     mysql_config["user"],
...     mysql_config["password"],
...     mysql_config["host"],
...     mysql_config["port"],
...     mysql_config["database"]
... )
>>> from sqlalchemy import create_engine, inspect
>>> engine = create_engine(SQLALCHEMY_DATABASE_URI)
>>> print(engine)
Engine(mysql+mysqlconnector://username:***@localhost:3306/dev)
>>>
```

requirements for running the sample code above:


```bash
pip install mysql-connector-python && pip install SQLAlchemy
```

Jason-Ku commented 4 years ago

Yes! It is possible to mock out init . I think this is the way to go. Even though the Engine and Pool objects are lazily initialized in create_engine (https://docs.sqlalchemy.org/en/13/core/engines.html), init still relies on a valid connection to the DB. For the tests, we would infinitely prefer not requiring a connection to the DB at all, so I think we need to either mock out init , or refactor the engine-creation bit into a separate function and mock that out instead
I think it's alright to keep using config.json - a refactor doesn't seem quite necessary yet but we're definitely due for some code cleanup and optimization. Personally, I don't think the file itself is too large, considering a lot of it is documentation (which is awesome)
This would let the tests get to the next hurdle but I believe we would run into issues in create_engine, as discussed in 1.

Jason-Ku commented 4 years ago

The more I think about it, the better pulling out the engine-creation part of init seems. I think it would result in an incredibly clean init method, and we could mock out the engine creation, removing the need for adding config.json to Github Actions. For example:

def __init__(self, config_file: str = "config.json") -> None:
    self.engine = _create_engine(config_file)
    self.Clubs = Clubs
    self.Sections = Sections
    self.Calendars = Calendars
    self.Courses = Courses
    self.Professors = Professors
    self.AudioSampleMetaData = AudioSampleMetaData
    self.Locations = Locations
    self.QuestionAnswerPair = QuestionAnswerPair
    self.inspector = inspect(self.engine)
    self._create_database_session()
    print("initialized NimbusMySQLAlchemy")

where _create_engine is something like:
def _create_engine(config_file: str) -> Engine
    with open(config_file) as json_data_file:
            config = json.load(json_data_file)

    if config.get("mysql", False):
            mysql_config = config["mysql"]
            RDBMS = "mysql"
            PIP_PACKAGE = "mysqlconnector"
            SQLALCHEMY_DATABASE_URI = "{}+{}://{}:{}@{}:{}/{}".format(
                RDBMS,
                PIP_PACKAGE,
                mysql_config["user"],
                mysql_config["password"],
                mysql_config["host"],
                mysql_config["port"],
                mysql_config["database"],
            )
            engine = create_engine(SQLALCHEMY_DATABASE_URI)

            if engine is None:
                raise BadConfigFileError("failed to connect to MySQL")
            else:
                return engine
    else:
            msg = "config.json is missing {} field.".format("mysql")
            raise BadConfigFileError(msg)

Jason-Ku commented 4 years ago

You're not doing anything wrong - it's just that there's no array of expected keys for the Calendars Entity in the EXPECTED_KEYS_BY_ENTITY dictionary yet.