Open ntoxeg opened 3 months ago
Ok, it seems that implementing indexing for dimers fixed this. I’m not entirely sure I did it right, but it works so far. Sorry for the confusion.
No worries! Please don't hesitate to raise more issues. I'm a little too swamped at the moment to answer promptly, but I will try to answer in a reasonable time.
Sadly, the issue is back and I don’t know why. It seems to be non-deterministic; perhaps there is an issue with running this on CUDA 12.1? (I had to upgrade because I run this on an H100, and it wouldn’t work with the old libraries this project used.)
I get this issue when there is an invalid rotation. Do you have any proteins with empty residues (i.e. the mask is all 0)? Are you initializing rotations with the identity? You can also put a try/except around the place where the error happens, print out the bad example, and inspect the tensors.
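For anyone else debugging this, here is a minimal sketch of the try/except idea. It is not code from this repository: `summarize_batch`, `run_step_with_diagnostics`, and `step_fn` are hypothetical names, and it assumes the batch is a dict of tensors, arrays, or nested lists. It only relies on `.tolist()`, which both PyTorch tensors and NumPy arrays provide.

```python
import math


def _flatten(value):
    """Yield scalar leaves; tensors/arrays are converted via .tolist() first."""
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple)):
        for item in value:
            yield from _flatten(item)
    else:
        yield value


def summarize_batch(batch):
    """Return {key: description} for fields that are all zero or contain NaN/inf."""
    problems = {}
    for key, value in batch.items():
        leaves = [x for x in _flatten(value) if isinstance(x, (int, float))]
        if not leaves:
            continue  # skip empty or non-numeric fields
        n_bad = sum(1 for x in leaves if not math.isfinite(x))
        if n_bad:
            problems[key] = f"{n_bad}/{len(leaves)} non-finite values"
        elif all(x == 0 for x in leaves):
            problems[key] = "all zeros"  # e.g. an empty residue mask
    return problems


def run_step_with_diagnostics(step_fn, batch):
    """Call step_fn(batch); on failure, print suspicious fields and re-raise."""
    try:
        return step_fn(batch)
    except Exception:
        print("Bad example, suspicious fields:", summarize_batch(batch))
        raise
```

Wrapping the failing call this way keeps the original traceback (the exception is re-raised) while showing which inputs were all zero or already contained NaN before the crash. Flattening with `.tolist()` is slow on large tensors, so this is meant for the failure path only.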
In a bad example I checked, the res mask is actually all 1, it seems, but trans_sc is all NaN, and the resultant model output is also all NaN.
Like the title suggests, I’ve managed to get a run going but it crashes with the following traceback