Closed by konsta-kukkonen 1 year ago
Hi Konsta! Thanks for your interest in GLUE. You are right, GLUE does not work well with small sample sizes (i.e., <2,000) because it is an over-parameterized neural network model and definitely needs a reasonable number of samples to train. Nevertheless, there are certain things you can try to make it more small-sample-friendly:
These might increase the chance of getting a reasonably trained model. Let me know if there are further issues, or if it actually works (I'm also interested 😀)
Thank you for the prompt response, Zhi-Jie. I will try your suggestions and report whether it works! :)
I'm slowly moving on with the analysis. I'm wondering how the downscaled parameters can be passed to the scglue.models.fit_SCGLUE() model fitting function.
I tried to define a new model with:
my_mod = scglue.models.scglue.SCGLUEModel(adatas={"rna": rna, "atac": atac}, vertices=guidance.nodes, latent_dim=5, h_depth=1, h_dim=32, dropout=0.2, shared_batches=False, random_seed=0)
my_mod.compile()
which worked. But when I pass it as the model parameter of the fit_SCGLUE function, it raises an error:
Traceback (most recent call last):
  File "./Model_training.py", line 154, in <module>
    glue = scglue.models.fit_SCGLUE(
  File "path/to/scglue/models/__init__.py", line 204, in fit_SCGLUE
    pretrain = model(adatas, sorted(graph.nodes), **pretrain_init_kws)
TypeError: 'SCGLUEModel' object is not callable
I understand that the correct type of the "model" object would be "type", as described on the Read the Docs documentation page, whereas the object I created is of type "scglue.models.scglue.SCGLUEModel". How do I make a model object of the correct type with my selected parameters?
It's possible I'm doing something that is obviously wrong, but as I said, I'm not very experienced with Python, and it has been a learning process even to get to this point. 😅
Thanks, -Konsta
Hi Konsta! As you have found out, the model argument of the fit_SCGLUE function only accepts a model type (the class itself), rather than an already constructed model object.
To tinker with the model structure, you can specify the model construction arguments using the init_kws argument, which will be passed on to the construction of model objects inside the fit_SCGLUE function.
For the above example, you may use something like this:
my_mod = scglue.models.fit_SCGLUE(
    {"rna": rna, "atac": atac},
    guidance,
    # init_kws is forwarded to the model constructor inside fit_SCGLUE
    init_kws=dict(
        latent_dim=5,
        h_depth=1,
        h_dim=32,
        dropout=0.2,
        shared_batches=False,
        random_seed=0,
    ),
)
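If the difference between a model type and a model object is unclear, the following self-contained toy sketch may help. It does not use scglue at all: ToyModel and toy_fit are hypothetical stand-ins that only mimic the pattern visible in the traceback above, where the function itself calls model(...) to build the object.

```python
# Toy stand-ins only: ToyModel and toy_fit are hypothetical, NOT part of scglue.
class ToyModel:
    def __init__(self, adatas, vertices, latent_dim=50, h_depth=2):
        self.adatas = adatas
        self.vertices = vertices
        self.latent_dim = latent_dim
        self.h_depth = h_depth


def toy_fit(adatas, graph_nodes, model=ToyModel, init_kws=None):
    """Construct the model inside the function from a class plus keywords."""
    init_kws = init_kws or {}
    # `model` is called here, so it must be a class (or another callable).
    return model(adatas, sorted(graph_nodes), **init_kws)


adatas = {"rna": "rna placeholder", "atac": "atac placeholder"}
nodes = {"geneB", "peak1", "geneA"}

# Passing the class together with constructor keywords works.
fitted = toy_fit(adatas, nodes, model=ToyModel,
                 init_kws=dict(latent_dim=5, h_depth=1))
print(fitted.latent_dim, fitted.vertices)

# Passing an already constructed instance fails: an instance is not callable.
instance = ToyModel(adatas, sorted(nodes), latent_dim=5)
try:
    toy_fit(adatas, nodes, model=instance)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The second call reproduces the same kind of TypeError as in the report, which is why the custom settings have to travel through init_kws rather than through a pre-built object.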
Oh, so those parameters should be passed to init_kws, not to model. Got it! Thank you
Hello and thank you for very interesting software!
The GLUE framework seems to be designed with single-cell omics data in mind. Is it possible to integrate bulk ATAC and RNA sequencing data with GLUE? I'm not very familiar with machine learning, but I understand that the training requires a lot of data, and you mention in the paper that below 2000 cells the alignment error starts to increase. Is it a lost cause to try to use GLUE with this type of data? I have multiple cell lines, treatments, and replicates from each condition, but not nearly enough samples to resemble anything like a single-cell experiment.
Apologies for my laziness: I haven't yet tried pre-processing the data into the form that GLUE takes as input. I haven't used Python much and wanted to first get your general opinion.
Best, -Konsta