Create function tests for Python

rafapereirabr commented 1 year ago

It might be a good idea to use the pytest testing system.

vss-2 commented 1 year ago

Hello, I would be happy to help!

Pytest offers lots of features to improve testability like fixtures, parameters decorators and marks.

Given that most tests are already written and use the default assert function (which integrates seamlessly with Pytest), I imagine that the suggestion is to modify the tests structures in a manner such as:

Split existing functions (with various asserts) into more specific ones with detailed parameters and error description/logs

Did I get it correctly? If no, what would be the desired improvements?

rafapereirabr commented 1 year ago

Hi @vss-2 . Thanks for your help. Yes, we already have some tests. It would be great to integrate the Python tests with code coverage and github actions. This way, we would (1) have a better understanding of how much of geobr Python code is tested and, (2) more importantly, to have automated checks on commits and pull requests to check if they don't break the Python package. I assume Pytest would be a simple way to do this. rigth ?

vss-2 commented 1 year ago

Absolutely right, I hadn't noticed Python package was missing test coverage. Although I have limited experience with GitHub Actions, I am committed to implement them in my geobr fork. As soon as I successfully complete the setup (or need some help, hahah), I will notify here.

vss-2 commented 1 year ago

Hello, after an extensive solving dependecy issues, I've succesfully established this stable workflow . Before pull-requesting I would like to highlight these two changes:

I've updated Shapely (1.7.0 -> 1.8.1) and Geopandas (0.7 -> 0.11.0) as they were outdated. This was causing broken dependencies across all operating systems (Windows, Linux, MacOS) on Python 3.9 or above but not in Python 3.8 (probably they last supported version);
The list_geobr() function here uses pd.read_html() which requires BeautifulSoup4 and html5lib (were also added to requirements.txt). Initially, this test was failing in all environments. Turns out (some error warned me) the current implementation of BeautifulSoup doesn't fetch and download HTML files, so I've modified this way to maintain the expected behavior;

The current workflow (Python-CMD-check) is configured to:

Check python-package/tests/ for every push event on any branch;
Run test job on the latest versions of: Ubuntu, MacOS and Windows;
For each OS: run tests using Python version 3.9 (*discuss further);
Install specific dependencies (step made specifically because shapely sometimes requires OS-native GEOS lib);
Install pip requirements.txt, save dependencies and start pytest/coverage;
Upload test-results.txt containing coverage info to workflow page (currently we're at 88% total coverage 🥳);

I would like to know what other Python versions should I add to testing workflow. My intention was to add versions from 3.8 to 3.11 (current maintened versions).

rafapereirabr commented 1 year ago

Hi @vss-2 . This looks great! thanks so much. This is a great contribution to make the Python version of geobr more robust. Regarding the versions of Python that should be tested, testing versions from 3.8 to 3.11 is fine. Please go ahead and open the PR.

vss-2 commented 1 year ago

Hello, I'm about to submit the PR. To provide context: I was conducting the final tests before submitting the pull request. It achieved 10 successes and encountered 2 errors due to an unavailable library for Python 3.11 for MacOS and Ubuntu, which I subsequently addressed. During this experiment, the tests took from 5 minutes to 1 hour to execute, possibly due to the increasingly load on IPEA server.

Shortly after solving the 2 errors, I submitted the definitive test and it encountered failures on all versions of MacOS and Ubuntu, this time related to the server (Windows did not present any errors because its data is cached somewhere).

I believe these fails are still related to issue #312 , that is: the current code breaks before trying to reach data from mirrors.

To ensure that no one experiences lag in their work (due to the server workload), later tonight I will submit one more test run.

vss-2 commented 1 year ago

Further follow-up: Last night, I attempted to run the failed jobs. By default, GitHub Actions suggests running all 8 failed executions at once, but as soon as they try downloading the .gpkg files, they failed. Might be IPEA's firewall blacklisting this "suspicious" dozens of same-source requests or GitHub internal tools avoiding network bottleneck. Therefore, I proceeded to manually run them one after the other, which is why there are records of 9 running attempts. In the end, all executions were successful.

rafapereirabr commented 1 year ago

Thank you very much for all the effort, @vss-2 ! I'm accepting th PR now. After merging this PR, we can continue collaborating on other issues with the Python version of geobr. Your contribution is much appreciated !

rafapereirabr commented 11 months ago

Hi @vss-2. This issue has been solved, right?

vss-2 commented 10 months ago

Hello, yes it's solved and may be closed :smile:

rafapereirabr commented 10 months ago

thanks!

ipeaGIT / geobr

Create function tests for Python #322