GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License

Access to load in memory coastline data #1636

Open aragong opened 2 years ago

aragong commented 2 years ago

Description of the desired feature

I would like to have a method to load coastline data into memory in Python, as a dataframe or a similar standard format. I think this method would fit well within pygmt.datasets, for example:

coast = pygmt.datasets.load_coastline(resolution="keyword", region="ES")

And the dataframe would look like:

polygon_id  longitude  latitude  area
0           -3.5       43.2      123
0           -3.6       43.5      123
0           -3.5       43.5      123
0           -3.6       43.2      123
1           -3.8       42.2      123
1           -3.9       42.5      123
1           -3.8       42.5      123
1           -3.9       42.2      123
2           ...        ...       ...
...         ...        ...       ...
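
To make the layout above concrete, here is a minimal sketch of how polygon rings could be flattened into one row per vertex. The helper name `polygons_to_rows` and the sample coordinates are hypothetical and only for illustration; none of this is part of PyGMT.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# A ring is a sequence of (longitude, latitude) vertices.
Ring = Sequence[Tuple[float, float]]


def polygons_to_rows(polygons: Iterable[Tuple[Ring, float]]) -> List[Dict[str, float]]:
    """Flatten (ring, area) pairs into one row per vertex.

    Hypothetical helper: each polygon gets a sequential polygon_id,
    and its area is repeated on every vertex row.
    """
    rows = []
    for polygon_id, (ring, area) in enumerate(polygons):
        for lon, lat in ring:
            rows.append(
                {
                    "polygon_id": polygon_id,
                    "longitude": lon,
                    "latitude": lat,
                    "area": area,
                }
            )
    return rows


# Made-up sample polygons mirroring the table above (not real coastline data).
sample = [
    ([(-3.5, 43.2), (-3.6, 43.5), (-3.5, 43.5), (-3.6, 43.2)], 123.0),
    ([(-3.8, 42.2), (-3.9, 42.5), (-3.8, 42.5), (-3.9, 42.2)], 123.0),
]
rows = polygons_to_rows(sample)
```

If pandas is installed, `pandas.DataFrame(rows)` turns this list into a dataframe with the four columns shown above.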

I think this is a basic feature that many people would use and that could improve the visibility and use of PyGMT. At the end of the day, we want a library that gives easy access to the basics (for plotting and also for loading data).

Also, I miss having direct access to GEBCO gridded bathymetry data, which is in high demand for coastal and ocean applications.

Are you willing to help implement and maintain this feature? Yes/No

I would like to help, but I am not sure I have the required knowledge, so I don't think I can implement or maintain this feature. But let me know; I am always open to learning!

Thank you for your time, have a nice day!

welcome[bot] commented 2 years ago

👋 Thanks for opening your first issue here! Please make sure you filled out the template with as much detail as possible. You might also want to take a look at our contributing guidelines and code of conduct.

aragong commented 2 years ago

Please check this workaround I've coded to process the GSHHS shapefile using geopandas. It may be useful if you consider this feature request.

repository: coastline-loader

best regards,

maxrjones commented 2 years ago

Hi @aragong, thanks for your feature request and for sharing your workaround! Do you mind sharing your use-case for the in-memory coastline data? I would like to find out more about how this feature would compare to functions or examples (which could leverage other packages) for working with OpenStreetMap data in PyGMT.

Also, I miss having direct access to GEBCO gridded bathymetry data, which is in high demand for coastal and ocean applications.

Can you clarify what you mean by this? Was direct access previously supported by GEBCO?

aragong commented 2 years ago

Hi @meghanrjones, happy to help the community.

Do you mind sharing your use-case for the in-memory coastline data?

There is a Python notebook called example in the repository that loads a custom subset of the EU coastline; please check the README and the pre-executed example.ipynb through the previous links. (There are also requirements files for pip and conda environments, and one test in the tests folder.) Please let me know if anything is unclear after reviewing the example. In that case, I would really appreciate your feedback to improve it.

Can you clarify what you mean by this? Was direct access previously supported by GEBCO?

Sure, I miss this functionality because it is a very, very common database in my field, but I don't know whether GEBCO has any protocol for sharing this data online, such as a public OPeNDAP server or similar. I only mean that, like the functionality you have for accessing SRTM databases, GEBCO access would be a very nice feature to have. The NetCDF database is around 4 GB.

Please let me know if you can run the examples. Thank you for your comments!

maxrjones commented 2 years ago

There is a Python notebook called example in the repository that loads a custom subset of the EU coastline; please check the README and the pre-executed example.ipynb through the previous links. (There are also requirements files for pip and conda environments, and one test in the tests folder.) Please let me know if anything is unclear after reviewing the example. In that case, I would really appreciate your feedback to improve it.

I look forward to checking it out in more detail. The environment.yml file lists a lot of dependencies. It would be helpful if you could separate out what is needed for using versus developing the package (e.g., pygmt's environment.yml). We're cautious about adding new dependencies to avoid potential conflicts and bloating the required install time/size, which influences how features get implemented in PyGMT.

Sure, I miss this functionality because it is a very, very common database in my field, but I don't know whether GEBCO has any protocol for sharing this data online, such as a public OPeNDAP server or similar. I only mean that, like the functionality you have for accessing SRTM databases, GEBCO access would be a very nice feature to have. The NetCDF database is around 4 GB.

This may be possible (see https://github.com/GenericMappingTools/remote-datasets/pull/3#issuecomment-984945628). If you want to make a formal feature request for easy access to the GEBCO grid, the gmtserver-admin repository would be the place to do that.

aragong commented 2 years ago

Thanks for your feedback,

The environment.yml file lists a lot of dependencies

You are right! Following your advice, I reduced it and split it into two files, environment.yml and environment_dev.yml. I've created a new tag, v0.1.1.

This may be possible (see GenericMappingTools/remote-datasets#3 (comment)). If you want to make a formal feature request for easy access to the GEBCO grid, the gmtserver-admin repository would be the place to do that.

I will do it soon, I hope... thank you!