Open 441YSK441 opened 3 weeks ago
Hi, when I run the grid_search with a mask I don't get nans in loss_grid, so I'd need a bit more information to reproduce the bug:
slicetca.block_mask
? One way I could see the loss being nan only when using a mask is if your train or test mask is False for all entries.
The pipeline schematized in Fig. 3 is roughly what is done in the notebook.
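To illustrate the all-False scenario, here is a minimal sketch. It assumes the masked loss is an average over the selected entries, which this thread does not confirm for slicetca itself. The helper `describe_mask` and the stand-in masks are hypothetical and not part of the library.

```python
import torch

def describe_mask(name, mask):
    """Print and return the number and fraction of True entries in a boolean mask."""
    n_true = int(mask.sum().item())
    frac = n_true / mask.numel()
    print(f"{name}: {n_true} True out of {mask.numel()} ({frac:.3f})")
    return n_true, frac

torch.manual_seed(0)
x = torch.randn(10, 10, 10)

# Stand-in masks for illustration only; in practice, pass your own train/test masks.
train_mask = torch.rand(10, 10, 10) > 0.2
empty_mask = torch.zeros(10, 10, 10, dtype=torch.bool)

n_train, _ = describe_mask("train_mask", train_mask)
n_empty, _ = describe_mask("empty_mask", empty_mask)

# Averaging over zero selected entries is 0/0, which yields nan.
# Assumption: a masked loss that averages over selected entries behaves the same way.
loss_on_empty = (x[empty_mask] ** 2).mean()
print(loss_on_empty)
```

Running `describe_mask` on both masks right before and right after the grid search would show whether either one is empty at the point where the loss is computed.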
Thank you for the reply.
One thing I noticed is that after running "slicetca.grid_search" in my environment, "reconstructed_noisy_tensor" is changed to a matrix containing only zeros, and "train_mask" and "test_mask" are changed to matrices containing only False. (Before running "slicetca.grid_search", the values of these matrices are normal.)
Another thing is that the numbers of True and False entries in train_mask and test_mask differ between my environment and the Colab environment (as shown below). Do you have any idea what causes these problems?
The number of true in train_mask: 6024557
The number of false in train_mask: 1209943
The number of true in test_mask: 672370
The number of false in test_mask: 6562130
Colab: The number of true in train_mask: 6027016
Colab: The number of false in train_mask: 1207484
Colab: The number of true in test_mask: 672298
Colab: The number of false in test_mask: 6562202
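One way to confirm whether a call overwrites its inputs in place, as with the zeroed tensor and all-False masks described above, is to snapshot the tensors before the call and compare them afterwards. The sketch below is only an illustration: `mutated_inputs` and `suspicious_call` are hypothetical names, and `suspicious_call` stands in for whatever function is under suspicion.

```python
import torch

def mutated_inputs(fn, **tensors):
    """Call fn(**tensors) and return the names of the tensors whose values changed."""
    snapshots = {name: t.clone() for name, t in tensors.items()}
    fn(**tensors)
    # Note: torch.equal treats nan as unequal to nan, so inputs containing nan are always flagged.
    return [name for name, t in tensors.items() if not torch.equal(t, snapshots[name])]

# Hypothetical stand-in for the call being checked; it zeroes `data` in place.
def suspicious_call(data, mask):
    data.zero_()

data = torch.ones(4, 4)
mask = torch.ones(4, 4, dtype=torch.bool)

changed = mutated_inputs(suspicious_call, data=data, mask=mask)
print(changed)
```

To apply the same check to the real call, wrap it in a small function that takes the tensor and the two masks as keyword arguments and forwards them unchanged.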
I tried running the example notebook with Python 3.10.12 and torch 2.3.0 but I still can't reproduce the issue.
Perhaps you could try to run this to check this is not an issue with the notebook:
import torch
import slicetca

# Use the GPU when available, otherwise fall back to the CPU
device = ('cuda' if torch.cuda.is_available() else 'cpu')
# Small random tensor as a minimal test case
T = torch.randn((10, 10, 10), device=device)
mask_train, mask_test = slicetca.block_mask(list(T.shape), [1, 0, 1], [1, 0, 0], fraction_test=0.1, device=device)
loss_grid, seed_grid = slicetca.grid_search(T, mask_train=mask_train, mask_test=mask_test, min_ranks=[0, 0, 0], max_ranks=[1, 0, 1], max_iter=2)
print(loss_grid)
I indeed get a non-nan loss_grid. The test_mask doesn't get modified. Note that to check the proportion of masked entries you can do print(test_mask.float().mean())
Regarding the number of masked entries, I believe this is just a difference in the RNG seeds.
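The seed explanation can be illustrated with a small sketch. It uses a simple independent random mask drawn from torch's global RNG, not slicetca.block_mask, and whether block_mask draws from that RNG is an assumption not confirmed in this thread. `random_mask_count` is a hypothetical helper.

```python
import torch

def random_mask_count(seed, shape=(50, 50, 50), fraction_test=0.1):
    """Count the True entries in a random test mask generated under a given seed."""
    torch.manual_seed(seed)
    test_mask = torch.rand(shape) < fraction_test
    return int(test_mask.sum().item())

count_a = random_mask_count(0)
count_b = random_mask_count(1)
count_a_again = random_mask_count(0)

# Same seed reproduces the count exactly; a different seed typically shifts it slightly.
print(count_a, count_b, count_a_again)

# Test-mask fractions computed from the counts reported in this thread
total = 672370 + 6562130
print(672370 / total, 672298 / total)
```

Under this assumption, fixing the seed before building the masks would make the counts match across runs on the same machine, although different platforms or torch versions may still produce different random streams.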
When I ran the given code once in my setup, loss_grid was nan. However, when I ran it a second time, I got a non-nan loss_grid. In addition, when I ran it in a Mac environment (I had been running the code on Windows), I got a non-nan loss_grid with every version of the code.
I'm sorry for the ambiguous comments; these are just case reports. I have no idea what the cause of the problem is, but I can get values by using a Mac, so I will probably use a Mac to calculate loss_grid. Thank you.
Hi. Thank you for developing this great analysis tool.
When I run the code below in sliceTCA_notebook_1.ipynb, the value of loss_grid is "nan", whereas if I delete "mask_train" and "mask_test", loss_grid returns values. Do you know how to solve this problem? I did not modify anything except "sample_size".
Another request: could you share additional code that describes the analysis flow in Figure 3 of the Pellegrino et al. paper?