JohnPaton / airbase

🌬 An easy downloader for the AirBase air quality data.
https://airbase.readthedocs.io
MIT License
9 stars 4 forks source link

async request and file writes #11

Closed avaldebe closed 2 years ago

avaldebe commented 2 years ago
JohnPaton commented 2 years ago

Hey, thanks so much for the proposal! As you mentioned in #10 there's quite a lot here that's out of the scope of the PR. Let me go through it a bit. I think the easiest thing will be to split this up into several smaller PRs, and then address issues in each one separately.

avaldebe commented 2 years ago

I'll create a new PR with with only what is needed for using aiohttp and aiofiles.

avaldebe commented 2 years ago

Hi John,

Before I close this PR I would like to know what else would you be interested on bringing over. I can prepare separate PRs for the following topics/issues

  1. Update project configuration
    • Use pyproject.toml and setup.cfg following PEP517
    • Update pytest and coverage configurations and move them to pyproject.toml
    • Configure mypy and isort for development and CI
    • Use tox for testing against different python versions (locally and on CI)
  2. Use pathlib.Path instead of os.path
  3. Add missing type annotations
  4. Use test functions instead of test classes

Cheers, Álvaro.

JohnPaton commented 2 years ago

I see you've started on 2. already, that's perfect. As for the others: I am all in favour (in individual PRs πŸ˜…) except for 4. The classes really only exist as a kind of namespacing for the functions anyway (+ applying fixtures en masse), I think removing them will make the tests less readable. Thoughts?

I've actually moved the CI to GitHub actions and am working on adding mypy there already (via pre-commit), as you may have noticed - thanks for the inspiration.

Really nice that you are cleaning things up around here btw - what are you using the package for?

avaldebe commented 2 years ago

I see you've started on 2. already, that's perfect. As for the others: I am all in favour (in individual PRs :sweat_smile:) except for 4. The classes really only exist as a kind of namespacing for the functions anyway (+ applying fixtures en masse), I think removing them will make the tests less readable. Thoughts?

4.- came as a work around my inability to run individual tests inside the classes (my problem, not the test suite). Since then I found how to do it.

However, IMO modules are better namespaces than classes. Therefore, splitting tests/test_airbase.py into tests/airbase/test_client.py and tests/airbase/test_request.py (containing tests for AirbaseClient and AirbaseRequest respectively) makes sense to me.

I've actually moved the CI to GitHub actions and added MyPy there already (via pre-commit), as you may have noticed - thanks for the inspiration.

πŸ‘πŸΌ

Really nice that you are cleaning things up around here btw - what are you using the package for?

Nothing productive yet. At work we retrieve the current year of observations every night on with a bash script + GNU parallel. Instead, I would like a CLI that stores the files as parquet files partitioned over year/specie/country and an installable intake catalogue (or something similar built on top of DuckDB).

Do you think that any of that would make sense here as extras?

JohnPaton commented 2 years ago

IMO modules are better namespaces than classes

I'm in two minds about this. For libraries I totally agree, but for tests I find it easiest to parse if the test package file structure matches the library file structure, and then all the tests for one class/function are grouped in classes. I'm open to being convinced otherwise though πŸ˜‰

CLI that stores the files as parquet files partitioned over year/specie/country

I'm certainly interested in expanding to a CLI, I was looking at tiangolo/typer for that actually. Additional file formats is also a good idea. I'm not familiar with intake or DuckDB though, I'd need to take a look!