fatiando / harmonica

Forward modeling, inversion, and processing gravity and magnetic data
https://www.fatiando.org/harmonica
BSD 3-Clause "New" or "Revised" License
205 stars 66 forks source link

Change behaviour of dtype on equivalent sources #516

Open santisoler opened 3 weeks ago

santisoler commented 3 weeks ago

Make use of the dtype argument in equivalent sources only to build the jacobian matrix. Stop using the dtype for casting predictions, coordinates and location of the sources. Predicted arrays will by float64, while coordinates and location of the sources will have the same types as the original arguments.

leouieda commented 3 weeks ago

Hi @santisoler, I'm curious about why this is needed.

santisoler commented 3 weeks ago

Hi @leouieda. I didn't quite like that the docstring for dtype weren't reflecting the true behaviour of it. We were using dtype not only for the jacobian and the predictions, but for casting the coordinates of data and points as well. I noticed this because some tests started to fail on Numpy 2.0 (I think because they changed some casting rules when operating with two different type of floats) and the reason for it was because we weren't casting the coordinates of sources every time we created them (see #514).

I fixed that issue in #514 by converting the coordinates of the sources, but it sounded wrong: we were using dtype in a different way as documented. What do you think?

leouieda commented 3 weeks ago

As long as the Jacobian and predictions are still in the correct dtype then it makes sense. Otherwise, it would be better to fix the docs, right?

santisoler commented 3 weeks ago

I think I'd prefer to keep things simple. The main reason to include the dtype is to lower the memory requirements to fit the sources, so the most important place to use it is for building the Jacobian. I don't think there's too much sense to use it to cast the coordinates of the sources.

And now that I think about it, since we don't use the Jacobian to make predictions, once the coefficients are fit, we can just return predictions as float64. Is there any sense to keep the predictions as float32 as well?

leouieda commented 3 weeks ago

I'm not sure. I suppose I thought of the dtype being for the model so then predictions would also be float32. I guess having the dtype impact only the jacobian and not the output of the class feels wrong.