Closed. dalonsoa closed this 5 days ago.
A couple of suggestions:

- `pyproject.toml` could use a bit of polishing:
  - Switch to using the modern successor of `setuptools`, `hatchling` (for why this is a good idea, I refer to their docs). The current `pyproject.toml`, most probably, requires some changes to work correctly. If needed, I can give the extra options that should be added for `setuptools`. Changing to `hatchling` requires just changing a few lines.
  - Dependencies require extra attention and adding some min versions, e.g., `pywbt>=0.2`, `geopandas>=1`, and `shapely>=2`. For example, I am not sure if `cdsapi`, `fiona` (`pyogrio` will be installed with `geopandas`), `fastparquet` (you already have `pyarrow`), `GitPython`, and `pyflwdir` are needed. Also, although `netcomp` is owned by Barnaby, it's generally not a good idea to have a not-published-on-PyPI dependency. Since its license is MIT, the parts of `netcomp` that are needed could just be copied over here as a new module, with credits to the original developer.
  - Add a URL section.
- For wheel and dist building, you can use the `build-and-inspect-python-package` action.
- It's a good idea to have a separate publish workflow that only runs on tagged pushes, so tags on the repo match the Python versions. This way you don't need a dependency on the CI action, since you won't release a tag unless it's tested thoroughly.
- For documentation, it seems that you're planning to use GitHub Pages, but there's no versioning, e.g. using `mike`. Although not necessary, versioned documentation is very useful. Also, I think you need to add `site_url` to `mkdocs.yaml`.
- Since the repo will be public, it's a good idea to add artifact attestation.

For my suggestions regarding Hatch you can refer to this file, and for the publishing workflow you can refer to this file.
> Since the repo will be public, it's a good idea to add artifact attestation.

Never heard of it, but sounds like a good idea.
> Switch to using the modern successor of `setuptools`, `hatchling` (for why this is a good idea, I refer to their docs). The current `pyproject.toml`, most probably, requires some changes to work correctly. If needed, I can give the extra options that should be added for `setuptools`. Changing to `hatchling` requires just changing a few lines.

Switching to Hatch might be a good idea or not (I have not read the docs yet), but that is beside the point of this PR. If we switch to Hatch, it should be its own separate PR to modernize the tooling. To be clear, I'm not against switching, but that's a separate issue.
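For context, the "few lines" in question are the build-backend settings; a minimal sketch of what the relevant part of `pyproject.toml` might look like after the switch (assuming no extra build options are needed):

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
```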
> Dependencies require extra attention and adding some min versions, e.g., `pywbt>=0.2`, `geopandas>=1`, and `shapely>=2`. For example, I am not sure if `cdsapi`, `fiona` (`pyogrio` will be installed with `geopandas`), `fastparquet` (you already have `pyarrow`), `GitPython`, and `pyflwdir` are needed.

No idea about these. @barneydobson?
> Also, although `netcomp` is owned by Barnaby, it's generally not a good idea to have a not-published-on-PyPI dependency. Since its license is MIT, the parts of `netcomp` that are needed could just be copied over here as a new module, with credits to the original developer.

I hadn't realised that `netcomp` is not on PyPI. PyPI does not allow GitHub dependencies, so the deployment would have failed anyway. Thanks for pointing this out! So, what's the best approach with that one?
> Add a URL section.

OK
> For wheel and dist building, you can use the `build-and-inspect-python-package` action.

Cool! Didn't know about this one.
> It's a good idea to have a separate publish workflow that only runs on tagged pushes, so tags on the repo match the Python versions. This way you don't need a dependency on the CI action, since you won't release a tag unless it's tested thoroughly.

I disagree with this one. Creating a Release that also creates a tag and runs the full suite of tests before publishing, to catch last-minute errors, is preferable to just pushing tags. It has fewer manual steps, is foolproof, and helps keep releases, tags and PyPI in sync.
> For documentation, it seems that you're planning to use GitHub Pages, but there's no versioning, e.g. using `mike`. Although not necessary, versioned documentation is very useful.

I've never found it useful to keep old versions of the documentation alive, but fine, it is not a big effort. Having said that, @barneydobson, if you prefer to publish things on Read the Docs, that's also fine, but since WSIMOD is on GitHub Pages, I thought of doing the same.

> Also, I think you need to add `site_url` to `mkdocs.yaml`.

Ok
> Switch to using the modern successor of `setuptools`, `hatchling` (for why this is a good idea, I refer to their docs). The current `pyproject.toml`, most probably, requires some changes to work correctly. If needed, I can give the extra options that should be added for `setuptools`. Changing to `hatchling` requires just changing a few lines.

Sorry, you were talking about `hatchling`, not adopting Hatch in full. Yes, it seems like a good move and part of this PR.
Trying to use `hatchling` and build the wheel has also flagged `netcomp` as an issue. It can be bypassed with the right flag, but PyPI will not accept that, so we go back to the beginning: `netcomp` cannot be there.
> So, what's the best approach with that one?

@dalonsoa Since `netcomp`'s license is MIT, the parts of it that are needed could just be copied over here as a new module, with credits to the original developer.
It feels a bit hacky... but I guess it would do, if that tool is absolutely necessary.
@dalonsoa `netcomp` is not maintained (the last commit is from 7 years ago), so I don't think there's any other option.
OK, I've pushed a few changes following @cheginit's suggestions, but there are a few things pending. I feel some of these should be done in a separate PR:

- `netcomp`: This is for @barneydobson
- `mike` to manage versions in the documentation: I've had a look at this and it's unclear to me how to use it. Getting the version number to use seems a bit hacky. I'd open an issue to implement it in the future, but not in this first version.

@dalonsoa I don't think your current setup of adding `ci.yml` to release will work, unless you add `workflow_call:` to `ci.yml` to make it callable from other workflows.
Sorry guys, I'm away this week, but it seems you are ironing out all the issues in here.

Re `netcomp`: I think @cheginit's suggestion of copying it over is fine. It is not core to the software, so I could also add it to the paper repository if you think that is tidier.
> @dalonsoa I don't think your current setup of adding `ci.yml` to release will work, unless you add `workflow_call:` to `ci.yml` to make it callable from other workflows.

It already had `workflow_call`. See https://github.com/ImperialCollegeLondon/SWMManywhere/blob/f3f1fcbcbb94f42125cfefe721e0944da1ae0560/.github/workflows/ci.yml#L3
I've checked all the dependencies and only `fiona`, `fastparquet` and `GitPython` are not required.
OK, one step closer.
I've added `netcomp` to `swmmanywhere` itself, hopefully in the right way for it to be available when deployed to PyPI. I've disabled running pre-commit on it because it was just failing on everything. That will need to be fixed at some point... but not today. And it is not necessary before release.
I've also removed unused dependencies. Only three in the end.
I had to pin `pywbt<=0.1.1` because the newest version, from a few hours ago, was breaking things. @cheginit, I believe you maintain `pywbt`, so it would be good if you could have a look at what needs changing.

Still pending: adding a lower version limit for some dependencies.
It seems like pre-commit keeps trying to run on `netcomp`. Well, let's see how we can avoid that.
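One way to do that (a sketch; the actual path of the vendored module is an assumption) is a top-level `exclude` pattern in `.pre-commit-config.yaml`:

```yaml
# Skip all hooks for the vendored netcomp module (path is illustrative).
exclude: ^swmmanywhere/netcomp/
```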
> I had to pin `pywbt<=0.1.1` because the newest version, from a few hours ago, was breaking things. @cheginit, I believe you maintain `pywbt`, so it would be good if you could have a look at what needs changing.

Ok, I see there's a whole PR about this, so I'll leave that one to be merged first and then I'll remove the pinning in this one.
OK, now there's an integer overflow error. I've no idea where that is coming from or why, so I'll leave this PR on hold until @barneydobson is back.
Things pending:
@dalonsoa The issue with the overflow is due to using `fiona` as the engine. I'm guessing the new version of `fiona`, which was released recently, made some changes that cannot parse long integers. The overflow issue is due to the `id` column of the `subs_rc`, which is an integer.

However, if you change the engine to `pyogrio`, it won't fail. More importantly, I am not sure why `geopandas` is pinned at 0.14; they released v1 back in June, so I highly recommend pinning it to `>=1`. The default engine of `geopandas` is now `pyogrio`.

I just updated `geopandas` to the latest version, `1.0.1`, and all tests passed without changing anything.
Give lower limits to dependencies: I don't feel it is necessary unless we already know some lower versions that will not work.
@dalonsoa A good reason for pinning lower limits is maintenance: geospatial-related packages, especially, can be a hassle to debug (the overflow issue being an example), so I think some important packages such as `geopandas`, `rasterio`, and `xarray` should be pinned to a stable release, just to avoid hard-to-debug issues once this repo becomes public.

Regarding the `pywbt` issue, I will take a look and suggest a solution.
Yeah, I totally agree with you. My point is that I do not know what to pin and to what minimum version. You spotted the issue with `geopandas`, so now we can pin that one. But I cannot arbitrarily pin things without more information.
> @dalonsoa The issue with the overflow is due to using `fiona` as the engine. I'm guessing the new version of `fiona`, which was released recently, made some changes that cannot parse long integers. The overflow issue is due to the `id` column of the `subs_rc`, which is an integer. However, if you change the engine to `pyogrio`, it won't fail. More importantly, I am not sure why `geopandas` is pinned at 0.14; they released v1 back in June, so I highly recommend pinning it to `>=1`. The default engine of `geopandas` is now `pyogrio`. I just updated `geopandas` to the latest version, `1.0.1`, and all tests passed without changing anything.
It seems we cannot use `geopandas>=1` because the last stable version of `osmnx`, 1.9.4, requires an older one. See https://github.com/gboeing/osmnx/blob/715d704b4b93bd29a4d6782923194f4f734658b2/pyproject.toml#L28
@dalonsoa Ah, ok. Yeah, makes sense.

For the `pywbt` issue, you just need to replace the code at https://github.com/ImperialCollegeLondon/SWMManywhere/blob/45f9bd1804d8cf654d6ed74de2366bce0b0167fa/swmmanywhere/geospatial_utilities.py#L644-L650 with this:

```python
whitebox_tools(
    temp_path,
    wbt_args,
    save_dir=temp_path,
    verbose=verbose(),
    wbt_root=temp_path / "WBT",
    zip_path=fid.parent / "whiteboxtools_binaries.zip",
    max_procs=1,
)
```

You can also add `whiteboxtools_binaries.zip` to `.gitignore`.
You also need to replace all instances of `np.Inf` with `np.inf`, since `np.Inf` has been deprecated in `numpy`. They were the same anyway.
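For the record, the two spellings referred to the same IEEE-754 infinity; only the lowercase one survives (the `np.Inf` alias was removed in NumPy 2.0):

```python
import numpy as np

# np.inf is the supported spelling; the deprecated alias np.Inf pointed
# at exactly the same floating-point infinity value.
print(np.inf == float("inf"))  # True
```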
I noticed that there's a deprecation warning in the tests. This is due to using `fillna` on a full dataframe, which is not a good idea anyway, so the code at https://github.com/ImperialCollegeLondon/SWMManywhere/blob/45f9bd1804d8cf654d6ed74de2366bce0b0167fa/swmmanywhere/post_processing.py#L436 should be replaced with:

```python
numeric_cols = shp.select_dtypes(include=[np.number]).columns
shp[numeric_cols] = shp[numeric_cols].fillna(0)
```
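To illustrate the pattern on a standalone example (a hypothetical dataframe; the real `shp` in `post_processing.py` is a GeoDataFrame, but the logic is the same):

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe mixing numeric and non-numeric columns.
shp = pd.DataFrame({"id": [1.0, None, 3.0], "name": ["a", None, "c"]})

# Fill NaNs only in numeric columns, leaving object columns untouched,
# which sidesteps the deprecation warning about filling object dtypes.
numeric_cols = shp.select_dtypes(include=[np.number]).columns
shp[numeric_cols] = shp[numeric_cols].fillna(0)
```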
> It seems we cannot use `geopandas>=1` because the last stable version of `osmnx`, 1.9.4, requires an older one. See gboeing/osmnx@715d704/pyproject.toml#L28
Ah, I see. It seems to be in preparation for the next release, since in the main branch `geopandas` is pinned at `>=1`. It seems that `1.9.3` works fine, though: I pinned `geopandas>=1` and created a new env by removing the req file (which resulted in installing `osmnx==1.9.3`), and all tests passed without any issue. I guess we can just pin `geopandas>=1` and `osmnx>=1.9.3`, just to be safe.
If we do that, `pip-compile`, used to create the various `requirements.txt` files, will complain, since there will be conflicting requirements. `geopandas>=1` and `osmnx>=1.9.3` might work perfectly fine together, but from the point of view of the declared dependencies, they are incompatible.
@dalonsoa I see what you mean. Although in my opinion the trade-off is fine, a workaround is to add `pyogrio` as a dependency and then, throughout the code base, wherever there is `gpd.to_file`, add the `engine="pyogrio"` argument. If there are many such instances, you can just create a partial function once and use it in the codebase.
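The partial-function idea could be sketched like this (the helper name `to_file` is illustrative, not from the codebase):

```python
from functools import partial

import geopandas as gpd

# A single module-level helper so every write uses the pyogrio engine
# without repeating the keyword argument at each call site.
to_file = partial(gpd.GeoDataFrame.to_file, engine="pyogrio")

# Usage would then be: to_file(gdf, "nodes.gpkg")
```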
Ok, this is now working. I had to ditch the requirements files, because they could not be produced with incompatible dependency requirements, and update the CI files where relevant.

Anything else you think should be pinned? I think I've fixed all your comments in https://github.com/ImperialCollegeLondon/SWMManywhere/pull/282#issuecomment-2361210862
@dalonsoa Awesome! No, I think the rest of the deps are fine. You can also remove `[tool.setuptools.packages.find]` and maybe add more classifiers, for example:

```toml
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Science/Research",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
    "Programming Language :: Python",
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Topic :: Scientific/Engineering",
    "Topic :: Scientific/Engineering :: GIS",
    "Typing :: Typed",
]
```
Regarding the versioning, if you're planning to release tags, you can use `hatch-vcs`. Then, you just need to remove the `version` field and add:

```toml
requires = [
    "hatch-vcs",
    "hatchling",
]
dynamic = [
    "version",
]
```
Then, in `__init__.py` add:

```python
from importlib.metadata import PackageNotFoundError, version

try:
    __version__ = version("swmmanywhere")
except PackageNotFoundError:
    __version__ = "999"
```
Done. I've removed Python 3.9 from the classifiers, as I don't think we are supporting it. I haven't checked, though.

Ideally, before a Release we should probably run the tests in a full matrix with all the OSes and Python versions that are meant to be supported, but maybe we can leave that for a future improvement.
I've updated the workflows such that:

- On the `main` branch, i.e. under normal development conditions, they run only on Windows and Ubuntu with Python 3.10.

This should give us more confidence that we are deploying something actually tested on the platforms and with the Python versions we say this works with, without any burden for normal development.
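A sketch of what such a reduced development matrix might look like in a GitHub Actions workflow (job and step names are illustrative, not the repo's actual `ci.yml`):

```yaml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
        python-version: ["3.10"]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: python -m pytest
```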
> I've removed Python 3.9 from the classifiers as I don't think we are supporting it.
@dalonsoa Right, it makes sense to set the minimum to 3.10. Many major packages have already dropped support for 3.9, since 3.13 is on the horizon.
The workflow setup is neat! I like it.
@dalonsoa Another thing that I noticed is that `pytest` is set to use the default `prepend` import mode. According to the best practices in `pytest`'s documentation, new projects should set `"--import-mode=importlib"`. If for some reason you prefer the `prepend` method, the src layout should be used as opposed to the flat project layout.
Also, I noticed that `-p no:warnings` is there too. It's generally not a good idea to suppress all warnings if they're not being handled by some other mechanism. I'd just suppress specific warnings that are beyond the scope of the project, after seeing them in `pytest` outputs, using:

```ini
[pytest]
filterwarnings =
    ignore:.*message.*
```
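On the import-mode point, the change is a single setting; a sketch assuming the `pytest` config lives in `pyproject.toml` rather than `pytest.ini`:

```toml
[tool.pytest.ini_options]
addopts = "--import-mode=importlib"
```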
@cheginit @dalonsoa are you both fine to resolve these comments between yourselves, and then I can review? I can't say I am following everything in this thread, but all seems to be going OK!
The PR looks good to me, and I don't have any other comments.
I'll tackle the last two remaining comments as soon as possible and then we can call it a day.
OK, all done!!
Now that we are not suppressing the warnings, as we should have been doing from the beginning, we get 812 warnings of all kinds when running the tests. We should have a look at some point, in case there's some low-hanging fruit or something really worrying, but that is beyond the scope of this PR.
From my point of view, we are ready to move forward.
@dalonsoa well done!
I skimmed the warnings; the only one that could be in the scope of this PR is adding a `downloads` mark in the `pytest` settings, and there's no need to rerun the checks (e.g., add `[skip ci]` to the commit message). There are a couple of low-hanging fruits too, but I agree that they should be addressed in another PR.
Done!
@dalonsoa should I set the repo visibility to public and then merge, and this does the rest? Or is there something else needed?

Both things are necessary... and then create a Release on the repo main page. We can do that together at 10 am or so, if you want to see how it goes and be able to do it in the future. And cross your fingers!
Having said that, are we waiting to create a release until we have the `conda-forge` equivalent workflow? @cheginit
I think @cheginit suggested we do this first: iron out any issues with the documentation etc., and then do it on `conda-forge`.

Also @dalonsoa, I'm in a meeting until 10, but I can Teams call you when back at my desk?
Sounds good!
Description

Sets up the `publish` workflow to deploy SWMManywhere to PyPI whenever there is a new release (with a new tag, important!). Essentially, this:

Obviously, this cannot be tested until we merge it and try to publish, but it is exactly the same workflow used in WSIMOD, which does work.

Fixes #261
Type of change

Key checklist

N/A

- (`python -m pytest`)
- (`python -m sphinx -b html docs docs/build`)
- (`pre-commit run --all-files`)

Further checks

N/A