nestauk / dap_taltech

Tutorials for taltech hack week 2023
MIT License

Set up repo and create utilities library #10

Closed india-kerle closed 1 year ago

india-kerle commented 1 year ago

This pull request starts to address #9. Note that the DataGetter class does NOT yet contain all the methods needed to load the different datasets.

This pull request does the following:

  1. It adds structure to the repo, where the tutorials directory will contain the tutorials for taltech hackweek 2023. It currently contains an illustrative directory called text_analysis/ that contains an empty readme.
  2. It creates a utilities library that currently contains a class to load data, either from s3 or locally, that is relevant across tutorials. You can run pip install -r requirements.txt at the root of the repository to install the utilities library.

To make sure the code runs:

conda create -n dap_taltech python=3.9
conda activate dap_taltech
pip install git+https://github.com/nestauk/dap_taltech.git@repo_setup

Then:

from dap_taltech.utils.data_getters import DataGetter

dg = DataGetter(local=True)
estonian_patents = dg.get_estonian_patents()
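
For reviewers who want a feel for the "s3 or locally" switch described above, here is a minimal sketch of how such a class could be laid out. It is not the code in this pull request: the class name SketchDataGetter, the data_dir and bucket parameters, the example/records.json path and the get_example_records method are all hypothetical, and the S3 branch is deliberately left unimplemented.

```python
import json
from pathlib import Path


class SketchDataGetter:
    """Hypothetical stand-in for DataGetter; not the repository's implementation."""

    def __init__(self, local=True, data_dir="data", bucket="example-bucket"):
        # `data_dir` and `bucket` are assumed names, used only for illustration.
        self.local = local
        self.base = data_dir if local else f"s3://{bucket}"

    def _path(self, relative):
        """Resolve a dataset's relative path against the local dir or the S3 prefix."""
        if self.local:
            return str(Path(self.base) / relative)
        return f"{self.base}/{relative}"

    def _load_json(self, relative):
        """Load a JSON file from the local data directory."""
        if not self.local:
            # A real implementation would read from S3 here; omitted in this sketch.
            raise NotImplementedError("S3 loading is omitted from this sketch")
        with open(self._path(relative), encoding="utf-8") as f:
            return json.load(f)

    def get_example_records(self):
        """One method per dataset, each delegating to the shared loader."""
        return self._load_json("example/records.json")
```

The idea is that each dataset gets its own small get_* method while the local/S3 decision lives in one place, so adding the remaining loaders for #9 would not require touching the path logic.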