Very long waiting time for the init step

AIH-SGML / mixmil

Code for the paper: Mixed Models with Multiple Instance Learning

https://arxiv.org/abs/2311.02455

Apache License 2.0

Very long waiting time for the init step #12

Closed. HelloWorldLTY closed this issue 3 weeks ago.

HelloWorldLTY commented 1 month ago

Hi, I found that the initial step of this method takes extremely long for me. I have been waiting for one hour. Any suggestions? Is this related to the scale of the data? I have 276,400 instances for training.

jan-engelmann commented 1 month ago

Hi, can you provide more detail?

Are you talking about the init method of the MixMIL class?
What's your embedding dimension?
How many bags do you have?

If you are standardizing the whole dataset, that can take a while.

Profiling your code would help you figure out where most of the compute time is spent.

Cheers!
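For readers following the profiling suggestion above, here is a minimal sketch using Python's built-in cProfile module. The functions `init_step` and `standardize` are hypothetical stand-ins and are not part of mixmil; replace them with the call that is actually slow in your script.

```python
import cProfile
import io
import pstats


def standardize(rows):
    """Hypothetical stand-in for a slow preprocessing step (column-wise z-scoring)."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    # Population standard deviation per column; constant columns fall back to 1.0.
    stds = [
        (sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)) ** 0.5 or 1.0
        for j in range(n_cols)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]


def init_step(rows):
    """Hypothetical stand-in for whatever the slow initialization calls."""
    return standardize(rows)


# Small toy data so the example finishes quickly.
rows = [[float(i * j % 7) for j in range(20)] for i in range(2000)]

profiler = cProfile.Profile()
profiler.enable()
result = init_step(rows)
profiler.disable()

# Collect the ten most expensive entries, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which should show whether the time goes into preprocessing, data loading, or the model initialization itself.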

HelloWorldLTY commented 4 weeks ago

Hi, thanks. I have ~2,000 bags, but 28,000+ features. Do you think the number of features will be a big problem here?

Also, I cannot fit my current model on a single A100 80GB GPU.
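A rough back-of-the-envelope calculation can help judge whether the feature count matters for memory. The sketch below assumes the 276,400 instances are stored as one dense float32 matrix with 28,000 features each (the thread does not state the storage format or dtype), and it counts only the raw data, not model parameters, gradients, or intermediate buffers.

```python
def dense_size_gb(n_rows, n_cols, bytes_per_value=4):
    """Size in GB (10**9 bytes) of a dense n_rows x n_cols array."""
    return n_rows * n_cols * bytes_per_value / 1e9


n_instances = 276_400  # instance count reported in this thread

# Assumption: float32 values (4 bytes each), dense storage.
full = dense_size_gb(n_instances, 28_000)   # all 28,000 features
reduced = dense_size_gb(n_instances, 256)   # after reduction to 256 dimensions

print(f"{full:.1f} GB vs {reduced:.2f} GB")  # prints "31.0 GB vs 0.28 GB"
```

Under these assumptions the raw features alone take a large share of an 80 GB GPU, while a 256-dimensional representation is small by comparison.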

jan-engelmann commented 3 weeks ago

Hi! In the paper we argue for unsupervised representation learning to reduce the dimensionality. What type of data is this? If it's scRNA-seq, I'd recommend, for example, scVI to reduce the dimensionality first.
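To illustrate the general idea of reducing dimensionality before fitting, here is a minimal sketch that uses plain PCA computed with NumPy's SVD. This is a swapped-in, simpler technique and not scVI or the method from the paper; the function name `pca_reduce` and the toy data sizes are made up for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (plain PCA, not scVI)."""
    X_centered = X - X.mean(axis=0, keepdims=True)
    # Thin SVD: rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))  # toy stand-in: 500 instances, 64 features
Z = pca_reduce(X, n_components=8)
print(Z.shape)
```

The reduced matrix `Z` would then take the place of the raw features as instance embeddings. For count data such as scRNA-seq, a dedicated model like scVI, as recommended above, is likely a better fit than linear PCA on raw values.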

HelloWorldLTY commented 3 weeks ago

OK, thanks. I tried reducing the data to 256 dimensions but still had the same issue. Maybe I can look for other dimensionality reduction tools.