Closed · KyleFerchen closed this issue 3 years ago
Hi Kyle,
I've seen this sort of issue before -- I think it has something to do with the latent representations being on the simplex and values near 0. You might try adding latent_distribution="normal" to scvi.model.TOTALVI, or, for example, using metric="correlation" in the call to scanpy's neighbors method.
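In code, those two suggestions would look roughly like this (a sketch assuming adata has already been set up for totalVI and that the latent representation gets stored under "X_totalVI"):

import scanpy as sc
import scvi

# Option 1: a Gaussian latent distribution instead of the simplex one
vae = scvi.model.TOTALVI(adata, latent_distribution="normal")
vae.train()

# Option 2: keep the latent space, but build the neighbor graph with a
# correlation metric, which is less sensitive to coordinates near 0
adata.obsm["X_totalVI"] = vae.get_latent_representation()
sc.pp.neighbors(adata, use_rep="X_totalVI", metric="correlation")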
I also wanted to add that it would take less than 1hr to run totalVI with this many cells on a GPU, and this dataset could be run on Google Colab for free.
Could you confirm that adata.obsm["X_totalVI"] actually contains NaN values?
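A quick way to check, assuming the representation was written to that obsm key:

import numpy as np

# count NaN entries in the stored latent representation
print(np.isnan(adata.obsm["X_totalVI"]).sum(), "NaN entries")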
Also the fact that you get this warning:
/users/fero3l/.local/lib/python3.7/site-packages/scvi/core/distributions/_negative_binomial.py:519: UserWarning: The value argument must be within the support of the distribution
UserWarning,
implies that your data is not count data.
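A sanity check along these lines, assuming the counts live in adata.layers["counts"] and the protein table in adata.obsm["protein_expression"] (as the setup log further down shows):

import numpy as np
from scipy import sparse

# verify the RNA layer holds non-negative integer counts
X = adata.layers["counts"]
X = X.toarray() if sparse.issparse(X) else np.asarray(X)
X = X.astype(float)  # dense float copy so np.isnan works on integer data too
print("NaNs:", np.isnan(X).any())
print("negatives:", (X < 0).any())
print("non-integers:", not np.allclose(X, np.round(X)))

# totalVI models the protein measurements as counts as well
prot = adata.obsm["protein_expression"]
print("protein NaNs:", prot.isna().any().any())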
Thank you, I will first try to get my GPU set up to run it, then try to reproduce the error, confirm the NaN values, and try the suggestions.
I set up PyTorch to work with my GTX 970 GPU. It trained much faster using the 200 epochs and 0.8 training size I used before:
vae.train(n_epochs=200, train_size=0.8)
INFO Training for 200 epochs.
/home/kyle/anaconda3/envs/scvi-env/lib/python3.7/site-packages/scvi/core/distributions/_negative_binomial.py:532: UserWarning: The value argument must be within the support of the distribution
UserWarning,
INFO KL warmup for 47280.75 iterations
Training...: 100%|██████████████████████████████████████████| 200/200 [26:33<00:00, 7.97s/it]
INFO Training is still in warming up phase. If your applications rely on the posterior quality, consider training for more epochs or reducing the kl warmup.
INFO Training time: 1475 s. / 200 epochs
I then confirmed that .get_latent_representation() is just returning NaN values:
vae.get_latent_representation()
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
I was going to try to set "latent_distribution" to 'normal', but it seems this is already the default, as you can see when I print out the vae object:
vae
TotalVI Model with the following params:
n_latent: 20, gene_dispersion: gene, protein_dispersion: protein, gene_likelihood: nb, latent_distribution: normal
Training status: Trained
To print summary of associated AnnData, use: scvi.data.view_anndata_setup(model.adata)
I'll just try to step through the code to find the problem, I guess.
A couple of things to look into:

1. Found batches with missing protein expression -- is it actually the case that some of your "batches" have non-identical protein features?
2. /home/kyle/anaconda3/envs/scvi-env/lib/python3.7/site-packages/scvi/core/distributions/_negative_binomial.py:532: UserWarning: The value argument must be within the support of the distribution -- you shouldn't be getting this warning unless there is a value in your adata that is continuous valued or np.nan, for example; maybe this has to do with point (1)?
3. Is vae.trainer.history["elbo_test_set"] nan? It's perplexing to me that you'd get np.nan after training, because it should lead to nan loss, which would stop training. 80% or 90% training size shouldn't make a difference for whether the method works or not.
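For point (3), a quick check, assuming the trainer history API from the scvi version used in this thread:

import numpy as np

# did the held-out ELBO ever become NaN during training?
elbo = np.asarray(vae.trainer.history["elbo_test_set"], dtype=float)
print("any NaN:", np.isnan(elbo).any())
print("last values:", elbo[-5:])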
Also @kyleferchen -- I'm happy to try it out myself if you send me the input anndata file.
I found my error! Just one of the cells in one of the datasets had some NaN values for protein expression, which I guess messed with everything else.
I just added a step to remove any cells that have missing values:
# keep only cells with no NaN values in the protein expression table
protein = adata.obsm["protein_expression"]
select_cells = protein.index[protein.isna().sum(axis=1) == 0]
adata_unfiltered = adata
adata = adata[select_cells, :]
Surprisingly, I still get the Found batches with missing protein expression message. I guess maybe some batches don't have any ADT counts for specific antibodies, but that didn't interfere with the training. It worked after removing the NaN values from the AnnData object's protein_expression table.
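For what it's worth, a rough way to see which proteins have no counts at all in a given batch (a hypothetical diagnostic, assuming the "Batch" obs key from the setup log and a protein_expression DataFrame indexed by cell names):

# count proteins with zero total signal per batch
prot = adata.obsm["protein_expression"]
for batch, idx in adata.obs.groupby("Batch").groups.items():
    sub = prot.loc[idx]
    n_missing = int((sub.fillna(0).sum(axis=0) == 0).sum())
    print(batch, "->", n_missing, "proteins with no counts")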
We designed totalVI to handle missing proteins, but you should be sure this is what you want, and take care if you look at the denoised protein expression values.
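For example (a sketch, assuming TOTALVI's get_normalized_expression method is available in this version): for proteins that a batch never measured, the denoised values below are model imputations rather than observations.

# model-based denoised expression, averaged over posterior samples
rna_denoised, protein_denoised = vae.get_normalized_expression(
    n_samples=25, return_mean=True
)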
We are having problems when we combine 6 of our CITE-seq datasets. The workflow is fine when we just use 2 of the datasets combined, but when we use all 6 I can never get the training step to work.
At first, the function scvi.model.TOTALVI.train() would just crash every time I started it with all 6 datasets, which I thought was because the adata object was too big and the function was likely allocating more memory than our HPC job could provide (>250GB). So I tried different training parameters, including reducing the number of epochs and adjusting the training set size to 0.8 instead of 0.9. After making those adjustments, the .train() function runs (taking about 3 days), but the resulting object only contains NaN values.
Have you ever experienced this issue with scaling the size of the input?
Do you think this could be an error with how one of the 6 CITE-seq datasets is structured?
What do you think would be the best approach for me to debug this?
Exited with exit code 1.
The output (if any) follows:
INFO Using batches from adata.obs["Batch"]
INFO No label_key inputted, assuming all cells have same label
INFO Using data from adata.layers["counts"]
INFO Computing library size prior per batch
INFO Using protein expression from adata.obsm['protein_expression']
INFO Using protein names from columns of adata.obsm['protein_expression']
INFO Found batches with missing protein expression
INFO Successfully registered anndata object containing 63041 cells, 4000 vars, 7 batches, 1 labels, and 195 proteins. Also registered 0 extra categorical covariates and 0 extra continuous covariates.
INFO Please do not further modify adata until model is trained.
INFO Training for 200 epochs.
INFO KL warmup for 47280.75 iterations
Training...: 100%|██████████| 200/200 [51:09:39<00:00, 920.90s/it]
INFO Training is still in warming up phase. If your applications rely on the posterior quality, consider training for more epochs or reducing the kl warmup.
INFO Training time: 161221 s. / 200 epochs