Open frank113 opened 1 year ago
This looks very cool. I'm not super familiar with these concepts but will read up when I have time.
FYI, I'll be traveling for most of the next 3 weeks, so I won't be able to do much beyond clicking "merge" from my phone on non-controversial PRs. But I am interested in this and will definitely look into it when possible.
As someone raised with Legos, specifically Lego Star Wars, I appreciate your profile picture on GitHub and Twitter.
Hehe!
@LamAdr, once you're done with the plot functions, maybe you could take a look at this?
The idea is to make it so that the countrycode()
function returns the same type of output as the input it gets: plain list, pandas series, or polars series. Then, we make the pandas and polars dependencies optional, so that users don't have to install all that if they only want to convert lists of codes.
Here's what the AI assistant has to say about making dependencies optional (this makes sense to me, but maybe you can find something better?):
To deal with optional dependencies for optional features in a Python package managed by Poetry, you can declare a dependency group as optional in the
pyproject.toml
file. This makes sense when you have a group of dependencies that are only required in a particular environment or for a specific purpose. You can do this by adding a[tool.poetry.group.<group>]
section where<group>
is the name of your dependency group, and settingoptional = true
. Then, you can add dependencies to this group using the[tool.poetry.group.<group>.dependencies]
section.For example, to add the optional dependency
redis
to a package, you can run the commandpoetry add --optional redis
, which will add the following configuration to yourpyproject.toml
file:[tool.poetry.dependencies] python = "^3.8" [tool.poetry.dependencies.redis] version = "^3.4.1" optional = true
Then, to install the optional dependencies, you can run
poetry install --extras "<group>"
, where<group>
is the name of the optional dependency group. In the case of theredis
example, you would runpoetry install --extras "redis"
.Here's an example of a function that uses an optional dependency if it is available:
def my_function(): try: import optional_dependency except ImportError: print("Optional dependency not found, continuing without it.") else: # Use the optional dependency print("Optional dependency found, using it.")
Sure!
@LamAdr if you have more time, I think it would be great to port some of the R tests to Python. I don't think we need to replicate the full test suite, because we already run it regularly in R. But testing some of the corner cases in particular would be worth it, just because the regex engines differ somewhat.
I would maybe start with these two files:
https://github.com/vincentarelbundock/countrycode/blob/main/tests/testthat/test-corner-cases.R
Sure @vincentarelbundock. Would you prioritize that or the book? In any case, I hope you are not in a hurry because I'm getting examed for a couple weeks, and not sure how much I can get done.
No rush at all. And no priority. You can choose what you work on.
@vincentarelbundock Documenting this discovery as a comment as it falls under the umbrella of testing and impacts how we are doing property-based testing:
Pytest has a number of fixtures available for repeated use across tests. We are currently loading in codelist
from countrycode
for non-Polars
tests and directly for tests involving Polars. As an aside, see comment in issue #8 for an update on enhancements that came out of getting the CI passing. My thoughts regarding these fixtures are to make stylistic changes to the tests to enable re-usability in terms of loaded data and helper functions. These can be scoped to either the entire test suite or to a group of tests by file. In my view the priority of using fixtures is low and can occur after more tests are ported from R.
Yeah, that all sounds good to me. I don't have much experience with unit tests in Python, so I'm perfectly happy to defer to you on almost all of this. Thanks!
Summary
We can use the
hypothesis
package to make unit testing more robust. Property-based testing will allow us to generate examples and edge cases either through random strings or strategies that can be based oncodelist
.For example the
test_invalid_iso3c_to_country_name
function intests/test_basic.py
could be rewritten as follows:codelist
'siso3c
column.countrycode(code, "iso3c", "country.name")
is alwaysNone
for all inputs generated with the above strategyConsiderations
References
Other
As someone raised with Legos, specifically Lego Star Wars, I appreciate your profile picture on GitHub and Twitter.