Nice work! Thank you very much for your contribution to the AI safety community!
I noticed a weird phenomenon when training the topology generator. The code for training the topology generator is:
toponet.train()
for _ in tqdm(range(args.gtn_epochs), desc="training topology generator"):
    optimizer_topo.zero_grad()
    # generate new adj_list by dr.data['adj_list']
    for gid in pset:
        SendtoCUDA(gid, [init_As, Ainputs, topomasks])  # only send the used graph items to cuda
        rst_bkdA = toponet(
            Ainputs[gid], topomasks[gid], topo_thrd, cuda, args.topo_activation, 'topo')
        # rst_bkdA = recover_mask(nodenums[gid], topomasks[gid], 'topo')
        # bkd_dr.data['adj_list'][gid] = torch.add(rst_bkdA, init_As[gid])
        bkd_dr.data['adj_list'][gid] = torch.add(
            rst_bkdA[:nodenums[gid], :nodenums[gid]], init_As[gid])  # only current position in cuda
        SendtoCPU(gid, [init_As, Ainputs, topomasks])
    loss = forwarding(args, bkd_dr, model, allset, criterion)
    loss.backward()
    optimizer_topo.step()
    torch.cuda.empty_cache()
toponet.eval()
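For context, one way to see whether any gradient reaches the generator's parameters at all would be to print their gradient norms right after loss.backward(); this is only a minimal diagnostic sketch, assuming toponet is a standard torch.nn.Module:

# Diagnostic sketch (assumption: toponet is a plain torch.nn.Module).
# Placed right after loss.backward(), it shows whether gradients reach the generator.
for name, p in toponet.named_parameters():
    if p.grad is None:
        print(f"{name}: grad is None (parameter not in the computation graph)")
    else:
        print(f"{name}: grad norm = {p.grad.norm().item():.6e}")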
When I check the parameters of the topology generator before and after training with the following snippet,
import copy

old_toponet = copy.deepcopy(toponet)

toponet.train()
for _ in tqdm(range(args.gtn_epochs), desc="training topology generator"):
    optimizer_topo.zero_grad()
    # generate new adj_list by dr.data['adj_list']
    for gid in pset:
        SendtoCUDA(gid, [init_As, Ainputs, topomasks])  # only send the used graph items to cuda
        rst_bkdA = toponet(
            Ainputs[gid], topomasks[gid], topo_thrd, cuda, args.topo_activation, 'topo')
        # rst_bkdA = recover_mask(nodenums[gid], topomasks[gid], 'topo')
        # bkd_dr.data['adj_list'][gid] = torch.add(rst_bkdA, init_As[gid])
        bkd_dr.data['adj_list'][gid] = torch.add(
            rst_bkdA[:nodenums[gid], :nodenums[gid]], init_As[gid])  # only current position in cuda
        SendtoCPU(gid, [init_As, Ainputs, topomasks])
    loss = forwarding(args, bkd_dr, model, allset, criterion)
    loss.backward()
    optimizer_topo.step()
    torch.cuda.empty_cache()
toponet.eval()

new_toponet = copy.deepcopy(toponet)

old_state_dict = old_toponet.state_dict()
new_state_dict = new_toponet.state_dict()
for name in old_state_dict:
    param_diff = new_state_dict[name] - old_state_dict[name]
    print(torch.mean(param_diff))
I found that there is no difference in the parameters after training: every printed mean is 0.
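A mean close to zero could in principle also come from updates that cancel out, so a stricter element-wise comparison may be more informative; a minimal sketch, assuming both copies are kept on the same device:

# Stricter comparison sketch (assumption: old_toponet and new_toponet are on the same device).
for name in old_state_dict:
    diff = (new_state_dict[name] - old_state_dict[name]).abs()
    identical = torch.equal(new_state_dict[name], old_state_dict[name])
    print(f"{name}: max abs diff = {diff.max().item():.6e}, identical = {identical}")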
Could you give me some suggestions about this problem? Thank you very much for any replies! :)