andrewcharlesjones / spatial-alignment

Alignment of spatial genomics data using deep Gaussian processes
MIT License

data simulation available? #1

Closed giovp closed 2 years ago

giovp commented 2 years ago

Hi,

Thanks for sharing the code for the paper. I was wondering whether it'd also be possible to share the code for the data simulation steps, in particular this method:

https://github.com/andrewcharlesjones/spatial-alignment/blob/8071482f12b7b8212ee3694c3763c01f59160835/experiments/simulations/two_dimensional_warp_magnitude.py#L14

I'd be keen to try out the method on different (larger) simulated data.

Thanks in advance for your help/time!

andrewcharlesjones commented 2 years ago

Hi Giovanni - yes! Sorry about this. I just added those functions to generate the simulated data. Let me know if you have any issues or questions.

(Sorry for the slow response - it appears that I didn't have my notifications configured right so I didn't see this.)

giovp commented 2 years ago

Hi, thanks for updating the simulations, but I'm still having issues. gpsa/__init__.py imports this:

https://github.com/andrewcharlesjones/spatial-alignment/blob/e10daa707d67bab60387aef5132c513726b22487/gpsa/__init__.py#L3

which doesn't exist. Also, if you check the util module in gpsa, it looks completely empty.

Besides that, any chance of making the package pip-installable (even only via git, no need to release on PyPI)? It'd make it useful in an analysis pipeline (right now it's not really usable unless sys.path is modified, which is not advisable).

Asking mainly because I'd like to test a bunch of registration methods, and the synthetic generation step that you provide seems fairly advanced and comprehensive (thanks again for sharing!).

Thanks again for the help!

andrewcharlesjones commented 2 years ago

Sorry about that - and thanks for noticing it. It was an issue with the gitignore. The file gpsa/util/util.py should be there now.

Do you mean a pip-installable package just for the data generation? Sure, that's doable if it's useful!

The basic idea for the data generation functions was to apply different "warping functions" to the spatial coordinates. This could be a Gaussian process, linear model, or something else. E.g. for a GP it would be something like:

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import multivariate_normal as mvn
from sklearn.gaussian_process.kernels import RBF

# Regular 2D grid of spatial coordinates for the first slice
grid_size = 20
limits = [0, 10]
x1s = np.linspace(*limits, num=grid_size)
x2s = np.linspace(*limits, num=grid_size)
X1, X2 = np.meshgrid(x1s, x2s)
X_slice1 = np.vstack([X1.ravel(), X2.ravel()]).T
n = len(X_slice1)

# RBF kernel evaluated on the grid, scaled by 0.1
K = 0.1 * RBF(length_scale=1)(X_slice1)

# Warp each coordinate dimension with an independent GP draw
X_slice2 = X_slice1 + mvn.rvs(mean=np.zeros(n), cov=K, size=2).T

# Plot the original and warped coordinates
plt.scatter(X_slice1[:, 0], X_slice1[:, 1])
plt.scatter(X_slice2[:, 0], X_slice2[:, 1])
plt.show()

[image: scatter plot of the original (X_slice1) and warped (X_slice2) grid coordinates]
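
For the linear-model case mentioned above, a minimal sketch might look like the following. It is an illustration only: the helper names `make_grid` and `linear_warp` and the parameters `slope_sd` and `intercept_sd` are assumptions made here, not functions or settings from the gpsa package.

```python
import numpy as np


def make_grid(grid_size=20, limits=(0.0, 10.0)):
    """Return a (grid_size**2, 2) array of regularly spaced 2D coordinates."""
    xs = np.linspace(limits[0], limits[1], num=grid_size)
    X1, X2 = np.meshgrid(xs, xs)
    return np.vstack([X1.ravel(), X2.ravel()]).T


def linear_warp(X, rng, slope_sd=0.1, intercept_sd=0.5):
    """Apply a random affine map X @ B + b (hypothetical helper).

    B is a small random perturbation of the identity matrix, so the warped
    slice stays close to the original; b is a random shift.
    """
    d = X.shape[1]
    B = np.eye(d) + slope_sd * rng.standard_normal((d, d))
    b = intercept_sd * rng.standard_normal(d)
    return X @ B + b, B, b


rng = np.random.default_rng(0)  # fixed seed so the warp is reproducible
X_slice1 = make_grid()
X_slice2, B, b = linear_warp(X_slice1, rng)
```

Returning `B` and `b` alongside the warped coordinates keeps the ground-truth transformation available, which can be handy when comparing how well different registration methods recover it. Larger simulated datasets can be produced by raising `grid_size`.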

giovp commented 2 years ago

thanks a lot for the help, it works!

> Do you mean a pip-installable package just for the data generation? Sure, that's doable if it's useful!

Yeah, that's what I was thinking, but it's not super important now that it can be run! Thanks a lot again!