twhughes closed this issue 4 years ago.
I took a stab at this, see this new method in my fork
https://github.com/twhughes/pylops/blob/master/pylops/LinearOperator.py#L122
It seems to work when I compare to the dense version.
Let me know if you would be interested in merging this into the main repo.
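For context, here is a minimal sketch of the unit-vector probing idea, using only scipy. This is illustrative, not the actual code in the fork (see the link above for that); the function name is hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import aslinearoperator

def tosparse_by_probing(op):
    """Build a CSR matrix from a scipy LinearOperator by applying it to
    every unit vector and recording the nonzeros of each resulting column.
    Sketch only; not the fork's implementation."""
    n, m = op.shape
    S = lil_matrix((n, m))            # LIL is efficient for incremental fill
    for j in range(m):
        e = np.zeros(m)
        e[j] = 1.0
        col = np.asarray(op.matvec(e)).ravel()   # j-th column of the operator
        for i in np.flatnonzero(col):
            S[i, j] = col[i]
    return S.tocsr()
```

Comparing the result against the dense version for a simple diagonal operator is an easy sanity check.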
Hi Tyler, nice :)
Off the top of my head, I cannot think of a more efficient way to achieve what you want than what you have done in your fork.
Please go ahead and make a PR. It would be great to have a test along the lines of `test_dense` in https://github.com/equinor/pylops/blob/50916c517aca8c25371394339206eb4e694bb559/pytests/test_linearoperator.py#L45. I think doing it with `Diagonal`, like in the dense case, will be enough to verify that the CSR representation is correct.
Great! I'll go ahead and write a test and make a PR.
Fantastic!
Will merge it first thing next week :)
Hi @twhughes, I just merged your PR, thanks a lot for your contribution :)
Btw, I looked a bit at your research, really cool stuff. I was wondering if/how you plan to use PyLops? Given your `tosparse` addition and some of your code/papers, I guess you work a lot with PDEs and direct solvers for very sparse matrices... this made me think that we could overload `tosparse` directly in some operators where a more efficient creation is possible.
Just to make an example: if we added `tosparse` to `FirstDerivative` so that it directly creates a banded matrix with -0.5 and +0.5 on the k=-1 and k=1 diagonals, and did something similar for the `SecondDerivative` and `Diagonal` operators (and perhaps a few others), then when doing something like:
```python
D = SecondDerivative(N)
W = Diagonal(w * c)
WE = W.tosparse() - D.tosparse()
```
or even better:
```python
WE = W - D           # still a lazy linear operator
WE = WE.tosparse()
```
we could create a frequency-domain wave-equation sparse matrix directly from an operator definition at almost no cost, as we would not need to probe the columns of the operator with unit vectors. Of course, as soon as an operator with no obvious sparse representation is involved, we would need to fall back on what you implemented.
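To illustrate the kind of direct construction meant here, a small sketch with scipy alone; the stencil values and variable names are illustrative, not actual pylops code. `D2` is what a specialised `SecondDerivative.tosparse()` could return directly, and `W` what `Diagonal.tosparse()` could return:

```python
import numpy as np
from scipy.sparse import diags

n, dx = 5, 1.0
w = 2.0 * np.ones(n)                  # example weight term (e.g. omega^2 / c^2)
# banded second-derivative matrix, built directly from its stencil:
D2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / dx**2
W = diags(w)                          # diagonal operator as a sparse matrix
WE = (W - D2).tocsr()                 # wave-equation-like matrix, at ~no cost
```

Building the matrix this way is O(n) in the number of nonzeros, whereas probing with unit vectors requires m operator applications.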
The last line of code (which would be ideal for users) may not be so straightforward, as we would need to do some checks in the overloaded `+` and `-` operators, but I think it is doable if this is of interest for some people. What do you think?
Hey, no problem. Thanks for merging it in! PyLops seems like a really exciting project, and potentially very useful for my work.
I'm a researcher in the field of photonics, so I'm interested in solving Maxwell's equations numerically. I have a project called 'ceviche', which is a finite-difference frequency-domain solver for Maxwell's equations. We build a sparse matrix equation `Ax = b`, which we solve for the electromagnetic fields `x` given some excitation `b`. The construction of `A` involves derivative operators and other sparse operators, so I was thinking of using pylops to make this construction more elegant. Right now we do things by hand and store the sparse matrices explicitly.
One of the main features of `ceviche` is its use of automatic differentiation. We use the `autograd` package, as well as our own extensions for sparse matrix algebra, to track the derivatives of each step in the simulation process. This lets us do gradient-based optimization of our simulations very flexibly. There are some nice examples in the package tutorial, linked in the readme.
I needed the `tosparse` method to solve this `Ax = b` problem; we like to do an LU decomposition of `A` (rather than an iterative solve), which is much faster for smaller problems.
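The LU route mentioned here can be sketched with scipy; the matrix below is a stand-in, not ceviche's actual system:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# Factorise A once, then reuse the factors for many right-hand sides b.
A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [1.0, 3.0, 1.0],
                         [0.0, 1.0, 2.0]]))
lu = splu(A)                   # sparse LU decomposition (SuperLU)
b = np.array([1.0, 2.0, 3.0])
x = lu.solve(b)                # solves A x = b using the stored factors
```

For repeated solves with the same `A` (e.g. many excitations), the factorisation cost is paid only once.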
Regarding your question, let me see if I understand: you're thinking of overloading the `tosparse` method for various subclasses of `LinearOperator`? For example, the `Diagonal` operator has a very simple sparse matrix form, so the idea is to define these explicitly in the subclasses, relying on the unit-vector multiplication trick only for the more general cases?
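A hedged sketch of that idea, with a hypothetical Diagonal-like subclass of scipy's `LinearOperator` (class and method names are illustrative, not pylops's actual classes):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator

class SparseDiagonal(LinearOperator):
    """Hypothetical operator with an explicit tosparse(): the sparse form
    is built directly from the stored diagonal, no unit-vector probing."""
    def __init__(self, d):
        self.d = np.asarray(d, dtype=float)
        super().__init__(dtype=self.d.dtype,
                         shape=(self.d.size, self.d.size))
    def _matvec(self, x):
        return self.d * x
    def _rmatvec(self, x):
        return self.d * x          # real diagonal: adjoint equals forward
    def tosparse(self):
        return diags(self.d).tocsr()   # direct O(n) construction
```

Operators without such a shortcut would simply inherit the generic probing-based `tosparse`.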
I think this is potentially interesting. A few initial thoughts on the `tosparse` method: I'm not sure whether the performance will be so slow that we would need to optimize it. Regarding how `tosparse` would work when creating new linear operators out of existing, simpler ones: I'm not sure, because I haven't dug around in the code very much, but I imagine one could override the `tosparse` method when the new linear operator is created.
In pseudocode:
```python
def __add__(self, other):
    result = super().__add__(other)   # build the combined operator
    result.tosparse = lambda result: self.tosparse() + other.tosparse()
    return result
```
Basically this would freeze the `tosparse` methods of the `self` and `other` objects into the `tosparse` method of `result`.
From your example:
```python
WE = WE.tosparse()  # would call W.tosparse() - D.tosparse()
```
note that this code won't work as written, because Python won't bind `tosparse` as a method of `result`, but there are workarounds; see [here](https://stackoverflow.com/questions/972/adding-a-method-to-an-existing-object-instance).
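One workaround from that answer is `types.MethodType`, which binds a plain function as a method of a single instance. A toy sketch (the class and return value are illustrative):

```python
import types

class Result:
    pass

def tosparse(self):
    # stand-in for combining the operands' sparse forms
    return "sparse-matrix"

r = Result()
# Bind the function to this one instance, so r is passed as `self`:
r.tosparse = types.MethodType(tosparse, r)
```

After binding, `r.tosparse()` behaves like a normal method of `r`.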
Anyway, I think this could be interesting. Right now I'm not sure the speed is enough of an issue, but if I run into this problem I can take a look.
Great to hear that; I would love to see PyLops used for problems like the ones you focus on.
So far it has been used mostly for geophysical applications, just because I and the other contributors work in this area, but its building blocks are general enough to be used in other domains.
I really like `ceviche`. I had a quick look at the tutorial and will give it a go as soon as I have a bit of time. I definitely see the value of automatic differentiation, especially for PDE-based optimisation, where hand-crafting gradients is generally a tedious job. We do that in geophysics, and perhaps it is time to switch to AD too...
Regarding my suggestion, you understood correctly. Some operators have an easy-to-create sparse representation that we could take advantage of. It is not my goal to have a hand-crafted sparse representation for all operators, as the idea of linear operators suits most of what we do in PyLops better than an actual dense/sparse matrix. But if we could do it for the operators usually involved in the solution of PDEs (e.g. FirstDerivative, SecondDerivative, Laplacian, Gradient, Curl, Diagonal), I think this could be a great addition. For small problems, as you say, I also believe `tosparse` as you wrote it would work fine and would not be slow. When the model space becomes very large (e.g. 3D applications), the other route is perhaps more appealing. Anyway, it's great if you can experiment with the current setup and see how things behave; we can always work on the special overloading of `tosparse` for some operators later on.
To your pseudocode: I think something like this may work. We would probably need to inherit from some of the private scipy classes that implement `+`, `-`, `*` (like we already do for `LinearOperator`) and add a `tosparse` method that would look pretty much like your lambda function... our `__add__` is currently just:
```python
def __add__(self, x):
    return aslinearoperator(super().__add__(x))
```
as we haven't had the need so far to add anything to the scipy one, but this time may be the case we do need to ;)
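A minimal sketch of what that extension could look like, attaching a `tosparse` closure to scipy's sum operator. The helper function and its signature are hypothetical, not pylops's actual `__add__`:

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import aslinearoperator

def add_with_tosparse(A, B, A_tosparse, B_tosparse):
    """Return A + B as a scipy LinearOperator that also carries a tosparse()
    combining the operands' own sparse constructors (illustrative API)."""
    result = aslinearoperator(A) + aslinearoperator(B)
    # a closure over the constructors, so no instance binding is needed:
    result.tosparse = lambda: A_tosparse() + B_tosparse()
    return result
```

The closure avoids the method-binding issue discussed earlier, since it captures the operands directly rather than taking `self`.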
Thanks for the response. If I get time, I'll definitely look into the performance. For your geophysical applications, what are the typical matrix/vector sizes you deal with? I'm finding that this `tosparse()` takes a few seconds for 10k elements, but more like minutes beyond that.
Anyway, really excited about the pylops project; perhaps I'll contribute some stuff in the future! Closing the issue for now :)
Really happy about your contribution, and it would be great to hear if/how you end up using PyLops in your project :)
In some of the geophysical applications I work on, the model size can easily reach millions of parameters (for 3D applications). Those are the problems where I use purely lazy linear operators, and the main reason behind the development of pylops.
For FWI-like applications based on PDE-constrained optimisation (which is closer to what you do in `ceviche`, for example), I would say between ~10k and ~100k for 2D applications, and again around some millions for 3D. In 3D, time-domain solvers are generally used because of the high memory requirements of frequency-domain methods (although there are cases where people have used FDFD...).
Sounds like really similar problem scales and issues between electromagnetics and geophysics; definitely a lot of overlap too. Curious, have you tried GPU backends for pylops?
Yes :) we have a small side project using PyTorch (https://github.com/equinor/pylops-gpu) which was aimed at two things: porting some operators to GPUs, where most of the computations can be performed without writing ad-hoc CUDA code, and getting pylops operators into neural networks (as they are linear, the gradient is just the adjoint applied to `v`, so it turned out that with a single wrapper we could get any operator in...). I see you also have some experience using PyTorch; any feedback is welcome! PS: your `wavetorch` project looks really cool, I was thinking of trying it out for geophysical inversion :)
I also played a bit with `cupy`, but at that time I found it a bit immature. I think things have improved a lot since, so I would like to try again.
pylops-gpu looks great! So you observed a 50x speedup? I'm starting to consider GPUs for a lot of my projects, and that makes this a very attractive option.
Thanks for the interest in wavetorch. Yeah, it's a differentiable scalar wave equation solver written in PyTorch. We used it to optimize the wave-propagation domain to do signal processing, but I think you could also use it for some geophysics applications (like inferring the density distribution of the surrounding medium based on reflections). You can check out the corresponding paper too if you're interested: https://advances.sciencemag.org/content/5/12/eaay6946.
Really cool stuff you're working on. Is pylops a side project, or is it something being used for internal projects at Equinor?
Thanks :) That was my idea: trying to see if we can invert for velocity/density using your code. I will read the paper first to understand how things work ;)
Regarding the 50x, I think it varies a lot from problem to problem, and also with what you compare: a moderately small CPU machine with a small GPU (say, from Colab), or a very big machine with, say, a Volta GPU... but for one common geophysical inversion problem, where the overall operator can be reduced to a chain of convolutions/correlations and where the entire process can sit on the GPU (no data movement until the end of the inversion), I saw up to 35x (https://github.com/mrava87/pylops_notebooks/blob/master/developement-cuda/SeismicInversion.ipynb).
I also wish I could do more on this project and get other problems fully onto the GPU, but I have not had much time lately... To your last question: this all started mostly as a side project in spare time. We are starting to use bits and pieces in Equinor, but I wouldn't say it is used as much as I hoped yet. Things take more time to get adopted in industry than in academia ;)
Is there a way to convert a `LinearOperator` to an explicit sparse matrix representation? That is, a sparse equivalent of the `todense()` method that returns a `scipy.sparse.csr_matrix` instead of a numpy array, for example? If not, do you have a suggestion for a workaround? One possibility is to apply the linear operator to each of the (dense) unit vectors and record the indices and entries of the nonzeros of the results. Is there a more elegant solution?
Appreciate the help!