sustainableaviation / EcoPyLot

🍃🛩️ Prospective environmental and economic life cycle assessment of aircraft made blazing fast
http://ecopylot.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Create Data Ingestion Pipeline #6

Closed iamsiddhantsahu closed 5 months ago

iamsiddhantsahu commented 7 months ago

The "data ingestion pipeline" should be built to perform these tasks:

  1. Ingest a JSON file, containing parameters (according to the schema defined in https://github.com/sustainableaviation/EcoPyLot/issues/7)
  2. Output a 4-dimensional xarray with the dimensions (pax, propulsion, year, parameters).
  3. Output a tuple of data structures containing the unique values of each of the 4 dimensions (pax, ...)
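A minimal sketch of tasks 1 to 3, not a definitive implementation. The schema from issue #7 is not reproduced here, so the flat record layout (keys `pax`, `propulsion`, `year`, `parameter`, `value`), the function name `records_to_dataarray` and the numeric values are all assumptions made for illustration only:

```python
import json

import numpy as np
import xarray as xr

# Dimension names from the issue text; the record field names are hypothetical.
DIMS = ("pax", "propulsion", "year", "parameters")
FIELDS = ("pax", "propulsion", "year", "parameter")


def records_to_dataarray(records: list[dict]) -> tuple[xr.DataArray, tuple[list, ...]]:
    """Build a 4-d DataArray plus the unique values of each dimension."""
    # Unique, sorted coordinate values for each of the four dimensions.
    unique_values = tuple(sorted({rec[field] for rec in records}) for field in FIELDS)
    # Start from NaN so that combinations absent from the input stay visible.
    data = np.full([len(values) for values in unique_values], np.nan)
    for rec in records:
        index = tuple(
            values.index(rec[field]) for values, field in zip(unique_values, FIELDS)
        )
        data[index] = rec["value"]
    array = xr.DataArray(data, coords=dict(zip(DIMS, unique_values)), dims=DIMS)
    return array, unique_values


# Placeholder input: the values 1.0, 2.0, 3.0 are dummies, not real aircraft data.
raw = json.loads("""
[
  {"pax": 100, "propulsion": "kerosene", "year": 2030, "parameter": "mass", "value": 1.0},
  {"pax": 100, "propulsion": "hydrogen", "year": 2030, "parameter": "mass", "value": 2.0},
  {"pax": 200, "propulsion": "kerosene", "year": 2050, "parameter": "range", "value": 3.0}
]
""")
array, unique_values = records_to_dataarray(raw)
```

Filling with NaN first is a design choice of this sketch: it keeps the array dense while making missing (pax, propulsion, year, parameter) combinations easy to spot with `array.isnull()`.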

In addition, the following utility functionality should be built:

  1. Check input data validity (is it valid JSON? does it conform to the defined schema?)
  2. Log output in every function, including statistics (e.g. "file loaded, read 778 unique parameters")
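The two utility points could look roughly like the sketch below, which uses only the standard library. The required keys stand in for the real schema from issue #7 and are hypothetical, as are the logger name `ingestion` and the function name `load_records`; full schema validation would need the actual schema definition.

```python
import json
import logging

logger = logging.getLogger("ingestion")  # hypothetical logger name

# Hypothetical stand-in for the schema defined in issue #7.
REQUIRED_KEYS = {"pax", "propulsion", "year", "parameter", "value"}


def load_records(text: str) -> list[dict]:
    """Parse JSON text, check its structure, and log basic statistics."""
    try:
        records = json.loads(text)
    except json.JSONDecodeError as err:
        logger.error("input is not valid JSON: %s", err)
        raise ValueError("input is not valid JSON") from err

    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    for position, rec in enumerate(records):
        if not isinstance(rec, dict):
            raise ValueError(f"record {position} is not a JSON object")
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            raise ValueError(f"record {position} is missing keys: {sorted(missing)}")

    n_parameters = len({rec["parameter"] for rec in records})
    logger.info("input loaded, read %d unique parameters", n_parameters)
    return records


logging.basicConfig(level=logging.INFO)
records = load_records(
    '[{"pax": 100, "propulsion": "kerosene", "year": 2030, "parameter": "mass", "value": 1.0}]'
)
```

Raising `ValueError` for both malformed JSON and structural problems gives callers a single exception type to handle; whether that is preferable to separate exception classes is left open here.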

[!NOTE] It is useful to think of the 4d xarray as a collection of 3d xarrays:

[image: xarray]
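To illustrate the note in code: selecting a single value along one dimension of a 4-d `xarray.DataArray` yields a 3-d array. The coordinate values and parameter names (`p0` to `p3`) below are placeholders chosen for this sketch, not project data.

```python
import numpy as np
import xarray as xr

# A 4-d array of zeros with placeholder coordinates.
array = xr.DataArray(
    np.zeros((2, 2, 3, 4)),
    dims=("pax", "propulsion", "year", "parameters"),
    coords={
        "pax": [100, 200],
        "propulsion": ["kerosene", "hydrogen"],
        "year": [2030, 2040, 2050],
        "parameters": ["p0", "p1", "p2", "p3"],
    },
)

# Fixing one pax value drops that dimension and leaves a 3-d array.
slab = array.sel(pax=100)

# The full 4-d array viewed as a collection of 3-d arrays, one per pax value.
slabs = {int(pax): array.sel(pax=pax) for pax in array["pax"].values}
```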

michaelweinold commented 7 months ago

...all functions must include type hints:

import pandas as pd

def myfunc(
  input1: str,
  input2: float
) -> pd.DataFrame:
  ...

and include extensive docstrings following the NumPy docstring style guide (numpydoc).
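As an illustration of both requirements together, here is a small typed function with a NumPy-style docstring. The function `count_unique` is a hypothetical helper invented for this example and is not part of EcoPyLot.

```python
def count_unique(records: list[dict[str, object]], key: str) -> int:
    """
    Count the distinct values stored under one key.

    Parameters
    ----------
    records : list of dict
        Flat records, for example as parsed from a JSON file.
    key : str
        Name of the field whose distinct values are counted.

    Returns
    -------
    int
        Number of distinct values found under ``key``.

    Raises
    ------
    KeyError
        If any record lacks ``key``.

    Examples
    --------
    >>> count_unique([{"year": 2030}, {"year": 2050}, {"year": 2030}], "year")
    2
    """
    # A set keeps each value once, so its length is the number of distinct values.
    return len({record[key] for record in records})
```

The section order (Parameters, Returns, Raises, Examples) follows the numpydoc convention, and the `Examples` entry can be checked with the standard library's `doctest` module.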