technologiestiftung / giessdenkiez-de-dwd-harvester

Gather precipitation data from DWD's radolan data set, for the region of Berlin and connect to the trees DB
https://www.giessdenkiez.de
MIT License
4 stars 9 forks source link

Make input geometry configurable #82

Closed sjockers closed 1 year ago

sjockers commented 1 year ago

Make it easier to use this project for regions other than Berlin by adding .env parameters for configuring the source shapefile for the buffer:

Defaults to previous settings (Berlin shapefile, buffer distance of 2000).

sjockers commented 1 year ago

Unrelated to the actual change: I had to update Shapely in order for the prepare script to properly handle the shapefile (Python 3.9 on a Mac).

ff6347 commented 1 year ago

Very nice. Can you also add the new ENV vars to the test harvest run. Also would be great if the PR would be against the staging branch first.

ff6347 commented 1 year ago

meh. The setup for this is annoying. I will add you as a collaborator so you can run these tests and access the secrets

sjockers commented 1 year ago

Thanks Fabian! I didn't use staging because it wasn't up-to-date with the master branch. But I think I can fix that now myself. I'll take care of it later today.

JensWinter commented 1 year ago

Make it easier to use this project for regions other than Berlin by adding .env parameters for configuring the source shapefile for the buffer:

  • Path to input shapefile

That means we have to add other (source) shape files to the repo, right? Currently there's only Berlin. Or can we also point to remote files (i.e. use urls) this way?

sjockers commented 1 year ago

@JensWinter My idea was exactly that: Fork the repo, add the geometry to the assets folder.

But generally, geopandas.read_file, which is used here, should also work with remote URLs. I haven't tested it though.

JensWinter commented 1 year ago

I'd love to see a solution without having to create a fork. Maybe it's ok to add several different shape files into the assets folder and reference these when executing the script?

Actually I was already working on that approach. Maybe you can check it out when I push my branch? (Probably later today)

A small problem is that in https://github.com/technologiestiftung/giessdenkiez-de-dwd-harvester/blob/ac273c7da839a511fc5956c99c6caf3d25563a6a/.github/workflows/test-harvest.yml#L59 the DWD-Harvester repo itself is referenced. Which makes it difficult to use forks.

After all this is repo is designed as a Github action. Which means it should be able to be reused by others. That's why I think relying on forks should be avoided.

sjockers commented 1 year ago

@JensWinter Ah, I see what you mean. No, this PR does not solve that specific problem. The buffer.shp is still overriden and needs to be checked into the repo, so a fork is indeed necessary.

@ff6347 I've added the variables to the test-harvest action, but as it turns out that's not necessary because prepare.py is not actually run in the action.

The check is failing because this present PR can't access the secrets. I've created a second PR from a branch inside this GH repo, where the check runs through. The code in both PRs is identical, see #85

JensWinter commented 1 year ago

@JensWinter Ah, I see what you mean. No, this PR does not solve that specific problem. The buffer.shp is still overriden and needs to be checked into the repo, so a fork is indeed necessary.

Ah yes, you're right, @sjockers! Well, then never mind.

I'm gonna create an issue then to tackle the main problem. Eventually the harvester should be able to run for any city without having to create forks. It is a github action after all. 😀

ff6347 commented 1 year ago

A small problem is that in run: cd harvester && docker build --tag technologiestiftung/giessdenkiez-de-dwd-harvester:test .

This only builds a docker container with that name on CI. It does not pull anything. The name could be changed.

ff6347 commented 1 year ago

I'm gonna create an issue then to tackle the main problem. Eventually the harvester should be able to run for any city without having to create forks. It is a github action after all. 😀

Yes. I would love that very much. The .shp should be present in the repo where the action is used. Not in the repo of the action.