calpoly-csai / api

Official API for the NIMBUS Voice Assistant accessible via HTTP REST protocol.
https://nimbus.api.calpolycsai.com/
GNU General Public License v3.0
9 stars 4 forks source link

[WIP] Adds type checking annotations auto-generated via Instagram's MonkeyType package #108

Open mfekadu opened 4 years ago

mfekadu commented 4 years ago

What's New?

Awesome Code Talk

How did I do this?

Installing MonkeyType

$ pipenv install MonkeyType --dev
$ pipenv shell

Running MonkeyType, which generates a monkeytype.sqlite3

$ monkeytype --verbose run nimbus.py

Then I asked 2 simple questions "What is Foaad's email?" and "Who is the contact for Color Coded?"

Checking for what modules MonkeyType explored while the nimbus.py code was running

$ monkeytype list-modules
QA
database_wrapper
nimbus_nlp.NIMBUS_NLP
nimbus_nlp.question_classifier
nimbus_nlp.save_and_load_model

Telling MonkeyType to generate type annotations

$ monkeytype --verbose stub QA
$ monkeytype --verbose stub database_wrapper
$ monkeytype --verbose stub nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose stub nimbus_nlp.question_classifier
$ monkeytype --verbose stub nimbus_nlp.save_and_load_model

Telling MonkeyType to edit our code and apply the type annotations

$ monkeytype --verbose apply QA
$ monkeytype --verbose apply database_wrapper
$ monkeytype --verbose apply nimbus_nlp.NIMBUS_NLP
$ monkeytype --verbose apply nimbus_nlp.question_classifier
$ monkeytype --verbose apply nimbus_nlp.save_and_load_model

Addresses #14 but does not fix

Type of change (pick-one)

How Has This Been Tested?

It has not. Let's not merge this until we have tests to verify the same functionality.

However, the auto-generated type annotations can spark discussion on the functions we have written

For example, the monkeytype annotations of these function arguments mismatch the docstring Args in this __init__

-    def __init__(self, q_format, db_query, format_answer):
+    def __init__(self, q_format: str, db_query: partial, format_answer: partial) -> None:
         """
         Args:
             q_format (str): Question format string
@@ -49,13 +51,13 @@ class QA:
         self.db_query = db_query
         self.format_answer = format_answer

https://github.com/calpoly-csai/api/blob/41945a06af197f68719799a6676578922d192ad3/QA.py#L38-L48

MonkeyType noticed a string returned from this function when called just once. Is the return value always a string?

-    def _get_data_from_db(self, extracted_vars):
+    def _get_data_from_db(self, extracted_vars: Dict[str, str]) -> str:
         return self.db_query(extracted_vars)

https://github.com/calpoly-csai/api/blob/41945a06af197f68719799a6676578922d192ad3/QA.py#L54-L55

Ah, nice! It is reassuring to see the docstring match the generated type annotation

-def create_qa_mapping(qa_list):
+def create_qa_mapping(qa_list: List[QA]) -> Dict[str, QA]:
     """
     Creates a dictionary whose values are QA objects and keys are the question
     formats of those QA objects.
@@ -146,18 +148,18 @@ def create_qa_mapping(qa_list):
 #     return functools.partial(_single_var_string_sub, a_format)

https://github.com/calpoly-csai/api/blob/41945a06af197f68719799a6676578922d192ad3/QA.py#L71-L74

hmm.. interesting type: DUMMY_NAME

+from google.cloud.automl_v1.types import PredictResponse
+from monkeytype.encoding import DUMMY_NAME
+from typing import Dict
-    def inline_text_payload(self, sent):
+    def inline_text_payload(self, sent: str) -> Dict[str, DUMMY_NAME]:
         '''
         Converts the input sentence into GCP's callable format

@@ -82,7 +85,7 @@ class Variable_Extraction:

         return {'text_snippet': {'content': sent, 'mime_type': 'text/plain'} }

https://github.com/calpoly-csai/api/blob/41945a06af197f68719799a6676578922d192ad3/nimbus_nlp/NIMBUS_NLP.py#L76-L86

Ah, nice! Another return type that seems legit. Or is it? Haha, I hope so!!

+from sklearn.neighbors.classification import KNeighborsClassifier
-def load_latest_model():
+def load_latest_model() -> KNeighborsClassifier:
     # https://stackoverflow.com/a/39327156
     train_path = PROJECT_DIR + '/models/classification/*'
     list_of_files = glob.glob(train_path)

https://github.com/calpoly-csai/api/blob/41945a06af197f68719799a6676578922d192ad3/nimbus_nlp/save_and_load_model.py#L33

There's more, check out the files changed. I have not gone through everything.

Checklist (check-all-before-merge)

formatting help: - [x] means "checked' and - [ ] means "unchecked"

mfekadu commented 4 years ago

Will do!

By the way, what’s a rebase? @Jason-Ku

I’ve only ever done “resolve conflicts” That’s the extent of my git-knowledge

snekiam commented 4 years ago

@mfekadu Here's github's page on rebase The TL;DR for this situation is that it allows you to squash a bunch of commits into one, which will make the repo cleaner, because then if there's a bug we discover in this code its all in one commit.

There's another page here with some nice ASCII art of some other reasons you might want to use a rebase. Hope this helps 😀