Closed siboehm closed 2 years ago
Profile of a training run on LINCS (you can upload it to https://speedscope.app to view it). There are clearly still a lot of things wrong with the evaluation. Also, the disentanglement evaluation never seems to finish (this is hard to profile, since it all happens inside some obscure sklearn function).
Profiling run without the disentanglement evaluation. I will take a closer look at this soon. Profile taken at commit df6e375add1983eefb31a7e6a8e5fafa8d576a76. Attachment: profile_6539366.txt
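The thread doesn't say which tool produced the attached profile. As a minimal sketch of one way to get a comparable per-function breakdown using only the Python standard library, the snippet below runs a function under cProfile and prints the calls with the highest cumulative time. The `evaluate` and `slow_metric` functions here are hypothetical stand-ins, not the project's actual code, and `profile_call` is a helper name made up for this example.

```python
import cProfile
import io
import pstats


def slow_metric(n):
    # Hypothetical stand-in for one expensive evaluation step.
    total = 0
    for i in range(n):
        total += i * i
    return total


def evaluate():
    # Hypothetical stand-in for the real evaluate; not the project's code.
    return [slow_metric(20_000) for _ in range(20)]


def profile_call(func, top=10):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the slowest call trees come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(evaluate)
    print(report)
```

Sorting by cumulative time tends to surface the outer call that dominates the run, which is usually the first thing to look for when a single stage takes an hour.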
This was closed by #45
evaluate still takes ~2h on LINCS (1h for evaluate, 1h for evaluate logfold). This makes it impossible to run the evaluation frequently; instead, we can only run it at the end of a full training run. There is no good reason why this should take 2h; it can probably be done in a few minutes.
How to deal with this:
Problems: It's hard to guess how much effort this will take; it could be 4h, it could be 3 days. For now we can already do some runs with just a single evaluation at the end, though this issue will have to be fixed to enable hyperparameter tuning and proper loss plots.