BritishGeologicalSurvey / etlhelper

ETL Helper is a Python ETL library to simplify data transfer into and out of databases.
https://britishgeologicalsurvey.github.io/etlhelper/
GNU Lesser General Public License v3.0

Add typehints #149

Closed: leorudczenko closed this issue 1 year ago

leorudczenko commented 1 year ago

Summary

The ETLHelper codebase should include typehints across the board to improve code documentation.

Description

Python typehints declare the expected type of a value. They can be applied to standalone variables, function arguments and function return values.

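Depending on the Python version, some typehints require the typing module. For example, before Python 3.9, generic annotations such as List[int] had to be imported from typing, whereas newer versions accept list[int] directly.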
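As a rough illustration of the three cases described above, here is a minimal sketch. The names `chunk_size` and `count_rows` are hypothetical and are not taken from the ETLHelper codebase; the `typing` imports are used so that the example also runs on older Python versions.

```python
from typing import List, Optional

# Standalone variable annotation (hypothetical name, for illustration only)
chunk_size: int = 5000


def count_rows(rows: List[tuple], limit: Optional[int] = None) -> int:
    """Return the number of rows, capped at `limit` when one is given.

    Both arguments and the return value carry typehints.
    """
    total = len(rows)
    return total if limit is None else min(total, limit)
```

Typehints are not enforced at runtime; a static checker such as mypy is needed to flag mismatches.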

There is a helpful tutorial on typehints here.

Acceptance Criteria

volcan01010 commented 1 year ago

For first pass, just do etl.py.

I will add notes on how to type a conn.

volcan01010 commented 1 year ago

Typing for a connection is more difficult. A connection can be a sqlite3.Connection, a psycopg2.extensions.connection, a pyodbc.Connection, etc. Listing every possible type explicitly would require each of the database driver modules to be installed. However, ETLHelper users only want to install the drivers that they are going to use.

We could use a Protocol to define a type that has the same methods and attributes as a DBAPI-specified Connection:

https://peps.python.org/pep-0544/

The methods and attributes of a Connection are here:

https://peps.python.org/pep-0249/#connection-objects
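A minimal sketch of what such a Protocol might look like is below. It assumes Python 3.8+ (where `typing.Protocol` is available) and covers only the four connection methods listed in PEP 249. The class name `Connection` and the helper `execute_one` are illustrative choices, not necessarily what ETLHelper ended up using.

```python
import sqlite3
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Connection(Protocol):
    """Structural type for a PEP 249 (DB API 2.0) connection.

    Any object with these methods matches, so no database driver
    needs to be imported to define the type.
    """

    def close(self) -> Any: ...

    def commit(self) -> Any: ...

    def rollback(self) -> Any: ...

    def cursor(self) -> Any: ...


def execute_one(conn: Connection, sql: str) -> Any:
    """Run a query and return the first row (hypothetical helper)."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchone()
    finally:
        cursor.close()
```

With `runtime_checkable`, `isinstance(conn, Connection)` only checks that the methods exist, not their signatures; full checking is left to a static type checker. Note that PEP 249 treats `rollback` as optional, so a stricter or looser Protocol may be preferable for some drivers.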

We also need to define a type for a chunk (an Iterable of Iterable of Any?).

Transform is a callable that takes a chunk and returns a chunk.

on_error is a callable that takes a list of failed rows (each with the row and the exception).
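The three types described above could be sketched as type aliases along these lines. The alias names (`Row`, `Chunk`, `Transform`, `FailedRow`, `OnError`) and the two example functions are assumptions made for illustration, and the `None` return type for on_error is also assumed; the definitions actually merged into ETLHelper may differ.

```python
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical aliases, following the descriptions in this thread
Row = Iterable[Any]
Chunk = Iterable[Row]
Transform = Callable[[Chunk], Chunk]
FailedRow = Tuple[Row, Exception]
OnError = Callable[[List[FailedRow]], None]


def upper_case_strings(chunk: Chunk) -> Chunk:
    """Example transform: upper-case every string value in each row."""
    return [
        tuple(v.upper() if isinstance(v, str) else v for v in row)
        for row in chunk
    ]


error_log: List[str] = []


def log_errors(failed_rows: List[FailedRow]) -> None:
    """Example on_error handler: record each failed row and its exception."""
    for row, exc in failed_rows:
        error_log.append(f"{row!r}: {exc}")


# Both functions satisfy the aliases under a static type checker
transform: Transform = upper_case_strings
on_error: OnError = log_errors
```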

We should also add the types to the docstrings.

volcan01010 commented 1 year ago

Closed by #186