twhughes closed this issue 4 years ago.
I took a stab at this, see this new method in my fork
https://github.com/twhughes/pylops/blob/master/pylops/LinearOperator.py#L122
It seems to work when I compare to the dense version.
Let me know if you would be interested in merging this into the main repo.
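For context, here is a minimal sketch of the unit-vector probing idea, using only scipy. This is illustrative, not the actual code in the fork (see the link above for that); the function name is hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import aslinearoperator

def tosparse_by_probing(op):
    """Build a CSR matrix from a scipy LinearOperator by applying it to
    every unit vector and recording the nonzeros of each resulting column.
    Sketch only; not the fork's implementation."""
    n, m = op.shape
    S = lil_matrix((n, m))            # LIL is efficient for incremental fill
    for j in range(m):
        e = np.zeros(m)
        e[j] = 1.0
        col = np.asarray(op.matvec(e)).ravel()   # j-th column of the operator
        for i in np.flatnonzero(col):
            S[i, j] = col[i]
    return S.tocsr()
```

Comparing the result against the dense version for a simple diagonal operator is an easy sanity check.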
Hi Tyler, nice :)
Off the top of my head, I cannot think of a more efficient way to achieve what you want than what you have done in your fork.
Please go ahead and make a PR. It would be great to have a test along the lines of `test_dense` in https://github.com/equinor/pylops/blob/50916c517aca8c25371394339206eb4e694bb559/pytests/test_linearoperator.py#L45. I think doing it with `Diagonal`, like in the dense case, will be enough to verify that the CSR representation is correct.
Great! I'll go ahead and write a test and make a PR.
Fantastic!
Will merge it first thing next week :)
Hi @twhughes, I just merged your PR, thanks a lot for your contribution :)
Btw, I looked a bit at your research, really cool stuff. I was wondering if/how you plan to use PyLops? Given your `tosparse` addition and some of your code/papers, I guess you work a lot with PDEs and direct solvers for very sparse matrices... this made me think that we could overload `tosparse` directly in some operators where a more efficient creation is possible.
Just to make an example: if we added `tosparse` to `FirstDerivative` so that it directly creates a banded matrix with -0.5 and +0.5 on the k=-1 and k=1 diagonals, and did something similar for the `SecondDerivative` and `Diagonal` operators (and perhaps a few others), then when doing something like:
```python
D = SecondDerivative(N)
W = Diagonal(w * c)
WE = W.tosparse() - D.tosparse()
```
or even better:
```python
WE = W - D           # still a lazy linear operator
WE = WE.tosparse()
```
we could create a frequency-domain wave-equation sparse matrix directly from an operator definition at almost no cost, as we would not need to probe the columns of the operator with unit vectors. Of course, as soon as an operator with no obvious sparse representation is involved, we would need to fall back on what you implemented.
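To illustrate the kind of direct construction meant here, a small sketch with scipy alone; the stencil values and variable names are illustrative, not actual pylops code. `D2` is what a specialised `SecondDerivative.tosparse()` could return directly, and `W` what `Diagonal.tosparse()` could return:

```python
import numpy as np
from scipy.sparse import diags

n, dx = 5, 1.0
w = 2.0 * np.ones(n)                  # example weight term (e.g. omega^2 / c^2)
# banded second-derivative matrix, built directly from its stencil:
D2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / dx**2
W = diags(w)                          # diagonal operator as a sparse matrix
WE = (W - D2).tocsr()                 # wave-equation-like matrix, at ~no cost
```

Building the matrix this way is O(n) in the number of nonzeros, whereas probing with unit vectors requires m operator applications.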
The last line of code (which would be ideal for users) may not be so straightforward, as we would need to do some checks in the overloaded `+` and `-` operators, but I think it is doable if this is of interest for some people. What do you think?
Hey, no problem. Thanks for merging it in! PyLops seems like a really exciting project, and potentially very useful for my work.
I'm a researcher in the field of photonics, so I'm interested in solving Maxwell's equations numerically. I have a project called 'ceviche', which is a finite-difference frequency-domain solver for Maxwell's equations. We build a sparse matrix equation `Ax = b`, which we solve for the electromagnetic fields `x` given some excitation `b`. The construction of `A` involves derivative operators and other sparse operators, so I was thinking of using pylops to make this construction more elegant. Right now we do things by hand and store the sparse matrices explicitly.
One of the main features of `ceviche` is its use of automatic differentiation. We use the `autograd` package, as well as our own extensions for sparse matrix algebra, to track the derivatives of each step in the simulation process. This lets us do gradient-based optimization of our simulations very flexibly. There are some nice examples in the package tutorial, linked in the readme.
I needed the `tosparse` method to solve this `Ax = b` problem; we like to do an LU decomposition of `A` (rather than an iterative solve), which is much faster for smaller problems.
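The LU route mentioned here can be sketched with scipy; the matrix below is a stand-in, not ceviche's actual system:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# Factorise A once, then reuse the factors for many right-hand sides b.
A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [1.0, 3.0, 1.0],
                         [0.0, 1.0, 2.0]]))
lu = splu(A)                   # sparse LU decomposition (SuperLU)
b = np.array([1.0, 2.0, 3.0])
x = lu.solve(b)                # solves A x = b using the stored factors
```

For repeated solves with the same `A` (e.g. many excitations), the factorisation cost is paid only once.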
Regarding your question, let me see if I understand: you're thinking of overloading the `tosparse` method for various subclasses of `LinearOperator`? For example, the `Diagonal` operator has a very simple sparse matrix form, so the idea is to define these explicitly in the subclasses, relying on the unit-vector multiplication trick only for the more general cases?
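A hedged sketch of that idea, with a hypothetical Diagonal-like subclass of scipy's `LinearOperator` (class and method names are illustrative, not pylops's actual classes):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator

class SparseDiagonal(LinearOperator):
    """Hypothetical operator with an explicit tosparse(): the sparse form
    is built directly from the stored diagonal, no unit-vector probing."""
    def __init__(self, d):
        self.d = np.asarray(d, dtype=float)
        super().__init__(dtype=self.d.dtype,
                         shape=(self.d.size, self.d.size))
    def _matvec(self, x):
        return self.d * x
    def _rmatvec(self, x):
        return self.d * x          # real diagonal: adjoint equals forward
    def tosparse(self):
        return diags(self.d).tocsr()   # direct O(n) construction
```

Operators without such a shortcut would simply inherit the generic probing-based `tosparse`.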
I think this is potentially interesting. A few initial thoughts on the `tosparse` method: I'm not sure whether the performance will be so slow that we would need to optimize it. Regarding how `tosparse` would work when creating new linear operators out of existing, simpler ones: I'm not sure, because I haven't dug around in the code very much, but I imagine one could override the `tosparse` method when the new linear operator is created.
In pseudocode:
```python
def __add__(self, other):
    result = super().__add__(other)   # build the combined operator
    result.tosparse = lambda result: self.tosparse() + other.tosparse()
    return result
```
Basically this would freeze the `tosparse` methods of the `self` and `other` objects into the `tosparse` method of `result`.
From your example:
```python
WE = WE.tosparse()  # would call W.tosparse() - D.tosparse()
```
note that this code won't work as written, because Python won't bind `tosparse` as a method of `result`, but there are workarounds; see [here](https://stackoverflow.com/questions/972/adding-a-method-to-an-existing-object-instance).
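One workaround from that answer is `types.MethodType`, which binds a plain function as a method of a single instance. A toy sketch (the class and return value are illustrative):

```python
import types

class Result:
    pass

def tosparse(self):
    # stand-in for combining the operands' sparse forms
    return "sparse-matrix"

r = Result()
# Bind the function to this one instance, so r is passed as `self`:
r.tosparse = types.MethodType(tosparse, r)
```

After binding, `r.tosparse()` behaves like a normal method of `r`.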
Anyway, I think this could be interesting. Right now I'm not sure the speed is enough of an issue, but if I run into this problem I can take a look.
Great to hear that; I would love to see PyLops used for problems like the ones you focus on.
So far it has been used mostly for geophysical applications, just because I and the other contributors work in this area, but its building blocks are general enough to be used in other domains.
I really like `ceviche`. I had a quick look at the tutorial and will give it a go as soon as I have a bit of time. I definitely see the value of automatic differentiation, especially for PDE-based optimisation, where hand-crafting gradients is generally a tedious job. We do that in geophysics, and perhaps it is time to switch to AD too...
Regarding my suggestion, you understood correctly. Some operators have an easy-to-create sparse representation that we could take advantage of. It is not my goal to have a hand-crafted sparse representation for all operators, as the idea of linear operators suits most of what we do in PyLops better than an actual dense/sparse matrix. But if we could do it for the operators usually involved in the solution of PDEs (e.g. FirstDerivative, SecondDerivative, Laplacian, Gradient, Curl, Diagonal), I think this could be a great addition. For small problems, as you say, I also believe `tosparse` as you wrote it would work fine and would not be slow. When the model space becomes very large (e.g. 3D applications), the other route is perhaps more appealing. Anyway, it's great if you can experiment with the current setup and see how things behave; we can always work on the special overloading of `tosparse` for some operators later on.
To your pseudocode: I think something like this may work. We would probably need to inherit from some of the private scipy classes that implement `+`, `-`, `*` (like we already do for `LinearOperator`) and add a `tosparse` method that would look pretty much like your lambda function... our `__add__` is currently just:
```python
def __add__(self, x):
    return aslinearoperator(super().__add__(x))
```
as we haven't had the need so far to add anything to the scipy one, but this time may be the case we do need to ;)
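A minimal sketch of what that extension could look like, attaching a `tosparse` closure to scipy's sum operator. The helper function and its signature are hypothetical, not pylops's actual `__add__`:

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import aslinearoperator

def add_with_tosparse(A, B, A_tosparse, B_tosparse):
    """Return A + B as a scipy LinearOperator that also carries a tosparse()
    combining the operands' own sparse constructors (illustrative API)."""
    result = aslinearoperator(A) + aslinearoperator(B)
    # a closure over the constructors, so no instance binding is needed:
    result.tosparse = lambda: A_tosparse() + B_tosparse()
    return result
```

The closure avoids the method-binding issue discussed earlier, since it captures the operands directly rather than taking `self`.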
Thanks for the response. If I get time, I'll definitely look into the performance. For your geophysical applications, what are the typical matrix/vector sizes you deal with? I'm finding that this `tosparse()` takes a few seconds for 10k elements, but more like minutes beyond that.
Anyway, really excited about the pylops project; perhaps I'll contribute some stuff in the future! Closing the issue for now :)
Really happy about your contribution, and it would be great to hear if/how you end up using PyLops in your project :)
In some of the geophysical applications I work on, the model size can easily reach millions of parameters (for 3D applications). Those are the problems where I use purely lazy linear operators, and the main reason behind the development of pylops.
For FWI-like applications based on PDE-constrained optimisation (which is closer to what you do in `ceviche`, for example), I would say between ~10k and ~100k for 2D applications, and again around some millions for 3D. In 3D, time-domain solvers are generally used because of the high memory requirements of frequency-domain methods (although there are cases where people have used FDFD...).
Sounds like really similar problem scales and issues between electromagnetics and geophysics; definitely a lot of overlap too. Curious, have you tried GPU backends for pylops?
Yes :) we have a small side project using PyTorch (https://github.com/equinor/pylops-gpu) which was aimed at two things: porting some operators to GPUs, where most of the computations can be performed without writing ad-hoc CUDA code, and getting pylops operators into neural networks (as they are linear, the gradient is just the adjoint applied to `v`, so it turned out that with a single wrapper we could get any operator in...). I see you also have some experience using PyTorch; any feedback is welcome! PS: your `wavetorch` project looks really cool, I was thinking of trying it out for geophysical inversion :)
I also played a bit with `cupy`, but at that time I found it a bit immature. I think things have improved a lot since, so I would like to try again.
pylops-gpu looks great! So you observed a 50x speedup? I'm starting to consider GPUs for a lot of my projects, and that makes this a very attractive option.
Thanks for the interest in wavetorch. Yeah, it's a differentiable scalar wave equation solver written in PyTorch. We used it to optimize the wave-propagation domain to do signal processing, but I think you could also use it for some geophysics applications (like inferring the density distribution of the surrounding medium based on reflections). You can check out the corresponding paper too if you're interested: https://advances.sciencemag.org/content/5/12/eaay6946.
Really cool stuff you're working on. Is pylops a side project, or is it something being used for internal projects at Equinor?
Thanks :) That was my idea: trying to see if we can invert for velocity/density using your code. I will read the paper first to understand how things work ;)
Regarding the 50x, I think it varies a lot from problem to problem, and also with what you compare: a moderately small CPU machine with a small GPU (say, from Colab), or a very big machine with, say, a Volta GPU... but for one common geophysical inversion problem, where the overall operator can be reduced to a chain of convolutions/correlations and where the entire process can sit on the GPU (no data movement until the end of the inversion), I saw up to 35x (https://github.com/mrava87/pylops_notebooks/blob/master/developement-cuda/SeismicInversion.ipynb).
I also wish I could do more on this project and get other problems fully onto the GPU, but I have not had much time lately... To your last question: this all started mostly as a side project in spare time. We are starting to use bits and pieces in Equinor, but I wouldn't say it is used as much as I hoped yet. Things take more time to get adopted in industry than in academia ;)
Is there a way to convert a `LinearOperator` to an explicit sparse matrix representation? That is, a sparse equivalent of the `todense()` method that returns a `scipy.sparse.csr_matrix` instead of a numpy array, for example? If not, do you have a suggestion for a workaround? One possibility is to apply the linear operator to each of the (dense) unit vectors and record the indices and entries of the nonzeros of the results. Is there a more elegant solution?
Appreciate the help!