PyLops / pylops

PyLops – A Linear-Operator Library for Python
https://pylops.readthedocs.io
GNU Lesser General Public License v3.0
420 stars 101 forks source link

Split Linear-Operator part from Geophysical part #22

Closed prisae closed 1 year ago

prisae commented 5 years ago

The linear operator part of pylops could be interesting and appeal to a wider community. The geophysical part, however, is more for a specific crowd of geophysicist.

I suggest to remove the geophysical parts and create with it a new package. This might attract more interest in both, pylops from the wider scientific community which would otherwise be put-off by the geophysics-focus, and the new package from the geophysics community which is not necessarily interested in low-level stuff which is present in pylops.

prisae commented 5 years ago

A first go at separating:

Things to keep:

Things to remove:

mrava87 commented 5 years ago

Thanks for the suggestion, it definitely makes sense to do that at some point.

Let me try to come up with a roadmap (in the documentation) with directions of where I see things evolving in various modules - with links to issues where more details are provided - which is something I plan to do anyways.

Maybe we can then use it decide at which stage it makes more sense to make this divorce happen?

prisae commented 5 years ago

Sounds good. Maybe it could be added from now on to each newly added thing if this would belong into the geophysics-part. So at least for everything new it is clear if it would remain in pylops or not.

mrava87 commented 5 years ago

Sounds like a good idea. Will try to label various pull requests if purely adding generic operators or geo related) (or add comments to each operator when multiple are added in one go)

ivasconcelosUU commented 5 years ago

I think this may be a good idea, but perhaps one for when the project grows a little more and becomes more consolidated.

For the time being, the project is new and has still much traction to gain: that will probably come from within the geoscience/inverse problems and related applications community that are within our network connections. If we separate the basic operator parts from applications completely now, I think it may lead to additional issues for us to think about:

But I think the suggestion is still a good one. With maybe a couple of immediate points we could work on:

This, of course, is just my two cents. ;-) Just think we need to promote and grow the project a bit, let it gain/build its community and identity, before any such major redesign/changes.

mrava87 commented 5 years ago

I agree with these points :) I would say to make this concrete we can create some goals and once they are met we can make the split happen:

To the point of ‘reorganizing the documentation’, let me have a go in coming weeks and then will ask you guys to take a look and see what you think :)

prisae commented 5 years ago

I agree with @ivasconcelosUU's view, but I think it still holds what I said earlier: To ensure from now on (specifically for new commits) a clear separation or documentation in the code base of what would belong into a pylops and what would belong into a new package. This way it will be much easier in the future to make this split.

mrava87 commented 5 years ago

Sounds good. Let’s try to make pull requests that don’t mix basic stuff and geostuff (no need to be super-strict if one of the two is minor compared to the other...) and tag it as ‘Core - bla bla’ or ‘Geo - bla bla’ and if we get something from another domain we can do ‘Medical - bla bla’ as well :)

I will actually open a new issue for solvers. I think as this split will break poeple’s codes (even if only for import hopefully) we can take it as excuse to bump version to 2.0 when we do it and make some major changes to the solvers as we will get more and more with time anyways. I want to change them from def to class and have a scikit-learn approach, this has the advantage in our setup that you can set the problem once and play with solvers parameters many times without having to spend time re-creating augmented operators and data (for example in the normal equations or regularized inversion part)...

leouieda commented 5 years ago

Jumping in here since I'm interested in using pylops in the future: I'd definitely be interested in this having a narrower scope. But I would suggest splitting it up as early as possible. As the project attracts more users, it will be harder to make backwards incompatible changes without irritating the users. This is specially true for other libraries that rely on pylops.

prisae commented 5 years ago

I second that. PyLops starts getting traction on SWUNG, and comes up sometimes in SciPy etc. Splitting-up down the road might become a nightmare.

mrava87 commented 5 years ago

Hi Leo, great :)

Let's do this, I can soon come up with a suggested split and rough API interface for both core pylops and geopart (e.g. geolops). We can iterate a bit on this with everyone involved in this issue and once we are all happy we can mark this point as a jumpy to v2.0.0.

After that v1... will keep leaving for some time backporting bug fixes and v2 will progress its own life as well as the indipendent geolops.

Making this split would reduce some of the extra dependencies of pylops (mandatory ones are already really just numpy and scipy) and help avoiding increasing the scope of the core library - which we can keep as discipline agnostic as possible.

What do you think?

mrava87 commented 5 years ago

Sorry I have not managed to come up with a suggestion yet.

I am inclined to wait until the paper that we submitted to SoftwareX is accepted before making any major change (including bumping version to 2.0.0) but I will come up with an initial proposal about this soon :)

leouieda commented 4 years ago

Hi @mrava87 I've been revisiting this again as I start to think more deeply about the linear algebra for my current projects. Any thoughts on the split? PyLops looks really promising and I'd like to get more involved. Personally, I really only need the general purpose linear operators. Mainly so I can do matrix operations that are independent of the backend.

mrava87 commented 4 years ago

Hi @leouieda, Sorry, I have been very slow on this. The main reason is that I would like the paper that was submitted to SoftwareX to be finally accepted and have the version ‘stamped’ by the journal to have the library like it is now. Otherwise it would require changing some figures and text (and another round of reviews, which unfortunately takes ages for this journal...).

After that I am happy to go for the split and make a v2 or the main library, move all the heavy geophysics operators to a separate project.

My proposal would be, keep:

and move out: -avo -waveeqprocessing

There is one more thing I have been wondering for v2 about solvers: so far they are function based, I thought about making them become class based (like svilir-learn), but I haven’t got time to play with this idea and see if it makes any sense to do the change. Any preference? It would be good if it makes sense to change to do it while moving things out and bumping the version major number.

Is this aligned with what you had in mind? If we agree on it then you can start using pylops and know that provided you don’t use any of the operators that will be moved out your codes will not be affected by the change :)

leouieda commented 4 years ago

@mrava87 sounds good to me. Any particular reason for singalprocessing to stick around?

so far they are function based, I thought about making them become class based

I've gone back and forth on this over and over again. In the end, I think it might be best to implement a proof-of-concept for both cases and see what feels better.

More important than that, what are your thoughts on supporting Dask for out-of-core and parallel inversions?

mrava87 commented 4 years ago

@mrava87 sounds good to me. Any particular reason for singalprocessing to stick around?

I think this is basic enough to be useful for larger audience than geoscience.

so far they are function based, I thought about making them become class based

I've gone back and forth on this over and over again. In the end, I think it might be best to implement a proof-of-concept for both cases and see what feels better.

It's two of us here. I like scikit-learn style but I then wonder if it adds anything more to this context... I agree, perhaps trying the other option indipendently and see how it feels is the best before making a final decision. So far I have no capacity to work on this given that the current status works very well anyways, I am happy if anyone else wants to give it a spin or let it sit until I get time myself ;)

More important than that, what are your thoughts on supporting Dask for out-of-core and parallel inversions?

Yes! It's already available here https://github.com/equinor/pylops-distributed :) This is experimental and driven by needs (not all operators of PyLops are also avaiable in their distributed part...), but it seems to work very well and would be happy to give more traction to that. My idea is that this is worth a separate project, 1. to avoid increasing the dependencies of PyLops (not everyone needs, wants to use out-of-core operators), 2. to keep it on 0.x.x version and allow much more freedom to make backward incompatible changes until a more mature stage.

Btw, our paper was finally published, I am happy to move on towards v2 and the split... quite busy in these days so I am not sure about timeframe but hopefully you will have a strip down version in coming months and all the heavy geophysics will move to its indipendent project ;)

leouieda commented 4 years ago

So far I have no capacity to work on this given that the current status works very well anyways, I am happy if anyone else wants to give it a spin or let it sit until I get time myself ;)

Two of us here as well :slightly_smiling_face:

Yes! It's already available here https://github.com/equinor/pylops-distributed :)

Awesome! I'll check it out. I agree that it's worth starting off as a separate project to move fast. Might be worth integrating in the future with Dask as an optional dependency.

Cool stuff! Now I just need some time to play around with code for a change...

mrava87 commented 4 years ago

Great :) happy to hear what you think about pylops-distributed (it’s still very much in experimental phase) and hopefully will get some time in near future to deal with this issue ;)

prisae commented 1 year ago

Did it happen? Cool!

mrava87 commented 1 year ago

No, it did not happen... but given how far we have gone and the very limited interest to split the library in parts I think we should not do it at this point, rather focus on solidifying the components that are more heavily requested and let everything else evolve as time and interest requires ;)

prisae commented 1 year ago

Ah - fair enough!

(Good to have your reasoning here written too, if someone looks for it / for future reference.)