SamsungLabs / MTL

MIT License

Two errors about the code #1

Open Baijiong-Lin opened 1 year ago

Baijiong-Lin commented 1 year ago

Thanks for your wonderful work.

I get two errors when I run the following command:

python train.py --benchmark nyuv2_mtan --balancer amtlub --data-path /path/to/nyuv2


The first error is: [image]

Commenting out the following line resolves this error: https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/benchmarks/nyuv2.py#L128

I think this line should also be commented out: https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/benchmarks/nyuv2.py#L88


The second error is: [image]

When using mtan, hrepr is a list here:

https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/optim/basic_balancer.py#L117-L147

The same problem occurs in https://github.com/SamsungLabs/MTL/blob/3f577f1365101fdd16094313b4a3b0d3845ce10d/code/optim/aligned/balancer.py#L74
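
A minimal sketch of one way gradient code could accept both cases, assuming the failure comes from code that expects hrepr to be a single tensor while mtan passes a list of tensors. The helper name grads_wrt_representation is hypothetical and is not part of this repository; it only illustrates the idea with torch.autograd.grad, which accepts a sequence of inputs.

```python
import torch


def grads_wrt_representation(loss, hrepr):
    """Return one flat gradient vector of `loss` w.r.t. `hrepr`.

    Hypothetical helper (not from the repo): `hrepr` may be a single
    tensor or a list/tuple of tensors.
    """
    # Normalize to a list so both cases follow the same code path.
    reprs = list(hrepr) if isinstance(hrepr, (list, tuple)) else [hrepr]
    # autograd.grad returns one gradient per input tensor.
    grads = torch.autograd.grad(loss, reprs, retain_graph=True)
    # Flatten and concatenate into a single vector.
    return torch.cat([g.reshape(-1) for g in grads])


if __name__ == "__main__":
    x = torch.ones(2, requires_grad=True)
    h1, h2 = x * 2, x * 3              # stand-ins for per-task representations
    loss = h1.sum() + (h2 ** 2).sum()
    print(grads_wrt_representation(loss, [h1, h2]))  # list input
    print(grads_wrt_representation(loss, h1))        # single-tensor input
```

Normalizing to a list at the top keeps the rest of the function identical for both model types, so no separate branch is needed for mtan.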

elaxEgan commented 8 months ago

Hello, I have also run into these two errors. The "hrepr is a list" error seems to occur when using amtlub and appears to be environment-related: switching to a different device seems to resolve it. However, when I set --compute-cnumber to True, GPU memory usage explodes during the singular value decomposition. Do you see this as well? I noticed that the gradient matrix is 3 × 44117184. Could you advise on the possible cause? Thank you.

Baijiong-Lin commented 8 months ago

I suggest using our re-implementation: https://github.com/median-research-group/LibMTL/blob/main/LibMTL/weighting/Aligned_MTL.py
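
On the memory question, a hedged note: whether this repository does so is an assumption I have not verified, but a full SVD of a 3 × 44117184 matrix (for example torch.linalg.svd with its default full_matrices=True) would try to build a 44117184 × 44117184 factor. The condition number only needs the singular values, and for a wide matrix those can be obtained from the small Gram matrix. Below is a sketch in NumPy; the function name condition_number_via_gram is hypothetical.

```python
import numpy as np


def condition_number_via_gram(G):
    """Condition number of a wide k x n matrix G (k << n).

    Uses the k x k Gram matrix G @ G.T, whose eigenvalues are the
    squared singular values of G, so no n x n factor is ever formed.
    """
    gram = G @ G.T                          # k x k, e.g. 3 x 3
    eigvals = np.linalg.eigvalsh(gram)      # real eigenvalues, ascending order
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off
    sigma = np.sqrt(eigvals)                # singular values of G, ascending
    if sigma[0] == 0.0:
        return float("inf")                 # rank-deficient matrix
    return float(sigma[-1] / sigma[0])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((3, 100_000))   # small stand-in for 3 x 44117184
    print(condition_number_via_gram(G))
```

The trade-off is that forming the Gram matrix squares the condition number internally, so this loses accuracy for very ill-conditioned matrices. Computing only the reduced SVD (full_matrices=False) is the alternative in that case.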

elaxEgan commented 8 months ago

Thank you for your response; it's very helpful to me.