GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License
758 stars 220 forks source link

Time to split up base_plotting.py into individual module files #685

Closed weiji14 closed 3 years ago

weiji14 commented 4 years ago

Description of the desired feature

The base_plotting.py file (https://github.com/GenericMappingTools/pygmt/blob/v0.2.0/pygmt/base_plotting.py) is the core of PyGMT and home to functions like fig.plot, fig.grdimage and so on. However, it has grown wayyyy too long (~2000 lines) and we're hitting pylint recommended limit (cf https://github.com/GenericMappingTools/pygmt/pull/525#issuecomment-723057170), namely C0302: Too many lines in module.

There will be even more modules coming too! I think we should make things modular and split each pygmt.Figure function into it's own standalone file like so:

Current state

--pygmt
   ├── figure.py
   └── base_plotting.py (contains `plot`, `grdimage`, `meca`, etc)

New state

--pygmt
  ├── src (new folder)
  |        ├── __init__.py
  |        ├── grdimage.py
  |        ├── meca.py
  |        └── *.py  (all other `fig.*` modules split out from base_plotting.py)
  └── figure.py

This is how GMT (see https://github.com/GenericMappingTools/gmt/tree/6.1/src) and GMT.jl (https://github.com/GenericMappingTools/GMT.jl/tree/master/src) organizes their source code too since there's so many GMT modules (https://docs.generic-mapping-tools.org/latest/modules.html).

Note that we originally had a gmt/modules folder but that was removed in 252d5df5f493ca2dafb3bed2a0b6909e37c40578, and the big Figure class was made because of a decision to use an object orientated style (i.e. fig.plot(), fig.grdimage(), etc) at some point in 2017. I believe there should be a way to keep that object orientated style while splitting the class into several files, so this refactor should provide the same functionality for our users going forward.

Transition state

To ease the transition, we can do it chunk by chunk:

  1. Split off the big supplementary modules (i.e. meca)
  2. Split off the grid plotting functions (grdcontour, grdimage, grdview)
  3. etc

Relevant issues:

Are you willing to help implement and maintain this feature? Yes

weiji14 commented 4 years ago

Can someone remind me what the _preprocess function in base_plotting.py does? I never quite understood what it's meant to do :confused:

https://github.com/GenericMappingTools/pygmt/blob/d3e131a863a008f14979e7d9764cc6167156376f/pygmt/base_plotting.py#L29-L55

seisman commented 4 years ago

I think it does nothing here, but the _preprocess function in the Figure class does something.

willschlitzer commented 3 years ago

@weiji14 Please pardon my ignorance on how this should be done, but are you envisioning that figure.py will directly import all of the different plotting methods separately (that have all be moved from base_plotting.py to their separate scripts) to be used by Figure and bypass BasePlotting? As far as I can tell, the tests should be unaffected, as long as pygmt.Figure() is able to use plotting methods.

I'd be happy to help on this issue, but am unfamiliar with the process of moving class methods to separate scripts. Are there any previous commits that demonstrate how this should be done? Thanks!

weiji14 commented 3 years ago

@weiji14 Please pardon my ignorance on how this should be done, but are you envisioning that figure.py will directly import all of the different plotting methods separately (that have all be moved from base_plotting.py to their separate scripts) to be used by Figure and bypass BasePlotting? As far as I can tell, the tests should be unaffected, as long as pygmt.Figure() is able to use plotting methods.

Essentially yes. The intention is to have each function be a standalone thing in one file. We may or may not keep a BasePlotting class intact, the implementation details are quite flexible at the moment.

I'd be happy to help on this issue, but am unfamiliar with the process of moving class methods to separate scripts. Are there any previous commits that demonstrate how this should be done? Thanks!

There's actually a work in progress PR at #686 which started to tackle this, but I got stuck (hitting the limits of my object-orientated Python knowledge) as every stackoverflow question on splitting up a Python class into separate files appears too hacky. You could start having a look there, or feel free to come up with an alternative!

willschlitzer commented 3 years ago

I'll take a look (AKA Google search) for some options. My initial thought when it seemed like too much work to have a Python class in separate files was to have each module be its separate class, but that just seems like it would make Figure too big and cumbersome instead of BasePlotting.

willschlitzer commented 3 years ago

Until this gets changed, can the 2000 line length be dropped from the style checks?

weiji14 commented 3 years ago

Until this gets changed, can the 2000 line length be dropped from the style checks?

You can disable this check on just the one file (base_plotting.py) using # pylint: disable=something where 'something' is the pylint error. I'll try and follow up on #686 PR sometime next month, or feel free (anyone) to have a go at this issue.

michaelgrund commented 3 years ago

Until this gets changed, can the 2000 line length be dropped from the style checks?

You can disable this check on just the one file (base_plotting.py) using # pylint: disable=something where 'something' is the pylint error. I'll try and follow up on #686 PR sometime next month, or feel free (anyone) to have a go at this issue.

Good to know, recently ran into the same issue during the creation of the rose wrapper.

willschlitzer commented 3 years ago

@weiji14 I definitely haven't made any progress on this. What are your thoughts on at least moving the parameters to a separate file, not unlike how we keep common parameters in COMMON_OPTIONS. My thought is that we could organize it somewhat like the test files (e.g. have a parameters_coast.py and parameters_basemap.py). It leaves us with the issue that base_plotting.py has so many functions in the Figure class, but it at least moves hundreds of lines of code out of the class without functional changes to the module.

weiji14 commented 3 years ago

Alright people, #686 is merged so most new plotting modules (i.e. those called using fig.something) should go into the pygmt/src folder whenever possible to keep base_plotting.py short (<3000 lines) This applies to open PRs for rose (#794) and solar (#804), we can open a separate issue to discuss if data processing modules like grd2cpt (#803) needs to go under pygmt/src.

See meca.py example for how it's done. You'll also need to add some lines at two other places like so:

https://github.com/GenericMappingTools/pygmt/blob/23454867d4785112281b397a08d7dac109d28f74/pygmt/src/__init__.py#L5

https://github.com/GenericMappingTools/pygmt/blob/23454867d4785112281b397a08d7dac109d28f74/pygmt/base_plotting.py#L1645-L1646

Sorry if this gives people more work, but it will make pygmt a bit more manageable into the future as we wrap > 100 gmt modules available at https://docs.generic-mapping-tools.org/latest/modules.html.

willschlitzer commented 3 years ago

@GenericMappingTools/python-maintainers I think we should move our existing plotting functions to their own modules, not just the new ones, for the sake of consistency. If this is what we want, I'm happy to work on it; I'm still quarantined for another few days.

seisman commented 3 years ago

You'll also need to add some lines at two other places like so:

https://github.com/GenericMappingTools/pygmt/blob/23454867d4785112281b397a08d7dac109d28f74/pygmt/src/__init__.py#L5

https://github.com/GenericMappingTools/pygmt/blob/23454867d4785112281b397a08d7dac109d28f74/pygmt/base_plotting.py#L1645-L1646

Looking at the changes in #808. It seems boring to modify two files. Perhaps we should revert to @weiji14's initial design (i.e., keep pygmt/src/__init__.py empty and use from pygmt.src.meca import meca in pygmt/base_plotting.py)?

willschlitzer commented 3 years ago

I think think it would be best to have a single import statement, like from pygmt.src import grdimage, plot, solar and the like. It would make the final import statements look cleaner, but the downside is that there is an extra __init__.py.

seisman commented 3 years ago

I think think it would be best to have a single import statement, like from pygmt.src import grdimage, plot, solar and the like. It would make the final import statements look cleaner, but the downside is that there is an extra __init__.py.

It's OK to me.

weiji14 commented 3 years ago

Closed by #808, thanks @willschlitzer! Note that base_plotting.py was deleted, and all modules (e.g. grdimage, colorbar, etc) are imported directly in figure.py now.