Open Baijiong-Lin opened 1 year ago
Hello, I have also encountered these two errors. The "hrepr is a list" error seems to occur when using amtlub and appears to be environment-related: switching to a different device seems to resolve it. However, when I set --compute-cnumber to True, GPU memory usage explodes during the singular value decomposition. Do you encounter this issue as well? I noticed that the gradient matrix has shape 3 × 44117184. Could you advise on the possible causes? Thank you.
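In case it helps anyone hitting the same memory problem: for a very wide gradient matrix (3 rows, tens of millions of columns), the singular values and the condition number can be obtained from the small 3 × 3 Gram matrix instead of decomposing the full matrix. The snippet below is only a sketch of that general linear-algebra trick in NumPy, not the code used in this repository. The function name `condition_number_via_gram` is hypothetical, and I have not verified that the SVD call is the actual cause of the memory growth. Note also that squaring the matrix loses precision for very small singular values.

```python
import numpy as np


def condition_number_via_gram(G):
    """Return (singular_values, condition_number) of a wide matrix G.

    Hypothetical helper: G has shape (num_tasks, num_params) with
    num_tasks << num_params. The nonzero singular values of G are the
    square roots of the eigenvalues of the small matrix G @ G.T.
    """
    G = np.asarray(G, dtype=np.float64)
    gram = G @ G.T                       # shape (num_tasks, num_tasks)
    eigvals = np.linalg.eigvalsh(gram)   # symmetric input, ascending order
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    sing = np.sqrt(eigvals)[::-1]        # descending, like np.linalg.svd
    smallest, largest = sing[-1], sing[0]
    cond = np.inf if smallest == 0.0 else largest / smallest
    return sing, cond
```

With this approach the only large object held in memory is G itself; the decomposition runs on a num_tasks × num_tasks matrix. If a full SVD is still needed, it may be worth checking whether it is called with full matrices enabled, since that would request a num_params × num_params factor.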
I suggest using our re-implementation https://github.com/median-research-group/LibMTL/blob/main/LibMTL/weighting/Aligned_MTL.py.
Thank you for your response; it's very helpful to me.
Thanks for your wonderful work.
I get two errors when I run the following command:
python train.py --benchmark nyuv2_mtan --balancer amtlub --data-path /path/to/nyuv2
The first error is
Commenting out the following line resolves this error: https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/benchmarks/nyuv2.py#L128
I think this line should also be commented out: https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/benchmarks/nyuv2.py#L88
The second error is
hrepr is a list here when using mtan:
https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/optim/basic_balancer.py#L117-L147
The same problem occurs in https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/optim/aligned/balancer.py#L74