EmmaRocheteau / eICU-GNN-LSTM

This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).

RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors) .... #3

Open Al-Dailami opened 2 years ago

Al-Dailami commented 2 years ago

Hello

First, I would like to thank you for sharing the code of your awesome project. I am trying to run your code and reproduce your experiments. Currently, I'm facing a problem. Here are the errors and my fixes:

[0]

File ".../eICU-GNN-LSTM/graph_construction/create_bert_graph.py", line 19, in make_graph_bert distances = torch.cdist(batch, bert, p=2.0, compute_mode='use_mm_for_euclid_dist_if_necessary')

RuntimeError: cdist only supports floating-point dtypes, X1 got: Byte

Fix: changed the dtype from ByteTensor to FloatTensor in File ".../eICU-GNN-LSTM/graph_construction/create_graph.py", line 15:

    dtype = torch.cuda.sparse.FloatTensor if device.type == 'cuda' else torch.sparse.FloatTensor

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/graph_construction/create_graph.py#L15
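For context, a minimal sketch of the dtype issue (the tensor names and shapes here are illustrative, not the repository's): torch.cdist rejects byte/integer inputs, so casting both operands to float before the call avoids the error.

```python
import torch

# illustrative shapes only; `batch` and `bert` stand in for the tensors passed to make_graph_bert
batch = torch.randint(0, 2, (4, 768), dtype=torch.uint8)  # ByteTensor: cdist rejects this
bert = torch.randint(0, 2, (10, 768), dtype=torch.uint8)

# casting to float avoids "cdist only supports floating-point dtypes"
distances = torch.cdist(batch.float(), bert.float(), p=2.0,
                        compute_mode='use_mm_for_euclid_dist_if_necessary')
print(distances.shape)  # torch.Size([4, 10])
```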

File "/home/sale/eICU-GNN-LSTM/graph_construction/create_graph.py", line 65, in make_graph_penalise s_pen = 5 * s - total_combined_diags # the 5 is fairly arbitrary but I don't want to penalise not sharing diagnoses too much

RuntimeError: The size of tensor a (89123) must match the size of tensor b (1000) at non-singleton dimension 1

Fix: File ".../eICU-GNN-LSTM/graph_construction/create_graph.py", line 194 (passing debug=False fixes the problem):

    u, v, vals, k = make_graph_penalise(all_diagnoses, scores, debug=False, k=args.k)

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/graph_construction/create_graph.py#L194
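For anyone hitting the same thing, here is a toy reproduction of the mismatch (shapes shrunk for illustration; the actual internals of make_graph_penalise may differ): a penalty term built from a debug-truncated diagnosis matrix no longer lines up with the full pairwise score matrix, which is presumably why debug=False makes the subtraction broadcast correctly.

```python
import torch

n_full, n_debug = 8, 5                               # stand-ins for 89123 and 1000
s = torch.zeros(n_full, n_full)                      # pairwise scores over all patients
total_combined_diags = torch.zeros(n_full, n_debug)  # built from the debug subset only

try:
    s_pen = 5 * s - total_combined_diags             # shapes (8, 8) vs (8, 5) cannot broadcast
except RuntimeError as e:
    print(e)  # The size of tensor a (8) must match the size of tensor b (5) at non-singleton dimension 1
```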

[1]

File "../projects/eICU-GNN-LSTM/src/models/pyg_ns.py", line 241, in inference edge_attn = torch.cat(edge_attn, dim=0) # [no. of edges, n_heads of that layer]

RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

Fix:

    if i == 1 and get_attn:
        edge_index_w_self_loops = torch.cat(edge_index_w_self_loops, dim=1)  # [2, n. of edges]
    if get_attn:
        edge_attn = torch.cat(edge_attn, dim=0)  # [no. of edges, n_heads of that layer]
        all_edge_attn.append(edge_attn)

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/src/models/pyg_ns.py#L241
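The underlying issue is simply that torch.cat on an empty list raises that aten::_cat error. A minimal sketch of the guard (the helper name is hypothetical), assuming edge_attn is only populated when attention weights are requested:

```python
import torch

def collect_edge_attn(edge_attn_chunks, get_attn):
    """Hypothetical helper: only concatenate if attention was actually collected."""
    if get_attn:
        return torch.cat(edge_attn_chunks, dim=0)  # [no. of edges, n_heads of that layer]
    return None

print(collect_edge_attn([], get_attn=False))                                          # None, no crash
print(collect_edge_attn([torch.ones(3, 4), torch.ones(2, 4)], get_attn=True).shape)   # torch.Size([5, 4])
```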

[2]

File "../eICU-GNN-LSTM/train_ns_lstmgnn.py", line 94, in validation_step out = out[self.dataset.data.val_mask] TypeError: only integer tensors of a single element can be converted to an index

Fix:

    out = out[0][self.dataset.data.val_mask]

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/train_ns_lstmgnn.py#L94
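A minimal sketch of why the original line fails, assuming the model returns a tuple (for example a combined output plus an LSTM-only output) rather than a single tensor: indexing a Python tuple with a boolean mask raises exactly that TypeError, so the tensor has to be selected from the tuple before masking.

```python
import torch

out = (torch.randn(5, 1), torch.randn(5, 1))              # tuple of predictions (assumed)
val_mask = torch.tensor([True, False, True, False, True])

# out[val_mask] -> TypeError: only integer tensors of a single element can be converted to an index
masked = out[0][val_mask]                                  # pick the tensor first, then apply the mask
print(masked.shape)                                        # torch.Size([3, 1])
```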

[3]

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Fix: in the same file "../eICU-GNN-LSTM/train_ns_lstmgnn.py", line 96, I added the following lines:

    out[out != out] = 0
    out_lstm[out_lstm != out_lstm] = 0

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/train_ns_lstmgnn.py#L94, because when I printed those matrices I found some NaN values.
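For reference, a small sketch of what that workaround does (toy tensors, not the repo's): x != x is only true for NaN entries, so the assignment zeroes them out before the metrics see them. On newer PyTorch versions torch.nan_to_num achieves the same effect.

```python
import torch

out = torch.tensor([0.2, float('nan'), 0.7])
out_lstm = torch.tensor([float('nan'), 0.5, 0.1])

out[out != out] = 0                  # x != x is True only where x is NaN
out_lstm[out_lstm != out_lstm] = 0

print(out)        # tensor([0.2000, 0.0000, 0.7000])
print(out_lstm)   # tensor([0.0000, 0.5000, 0.1000])
```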

After this, the code starts training, BUT with weird training progress (the loss is always nan)! Printing the output matrices shows that they are always NaNs!

    acc: 0.9049 prec0: 0.9049 prec1: nan rec0: 1.0000 rec1: 0.0000 auroc: 0.5000 auprc: 0.5476 minpse: 0.0951 f1macro: 0.4750
    Epoch 1: 92%|█████████████████████████████████████████████████████████████████████████████████▎ | 452/489 [00:35<00:02, 12.78it/s, loss=nan, v_num=83]

I tried to trace the source of the error, and the NaNs appear after the LSTM layer, at this line:

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/src/models/lstm.py#L39
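A small sketch of how one can check whether the NaNs already exist in the LSTM input or only appear in its output (toy layer and shapes, not the actual model): a single NaN timestep in the input is enough to make the LSTM output NaN, so inspecting both tensors narrows down where the corruption starts.

```python
import torch

lstm = torch.nn.LSTM(input_size=16, hidden_size=8, batch_first=True)  # toy stand-in for the model's LSTM
seq = torch.randn(4, 24, 16)                                          # [batch, time, features]
seq[0, 3, 2] = float('nan')                                           # one corrupted value

out, _ = lstm(seq)
print(torch.isnan(seq).any().item())   # True -> the NaNs are already in the input
print(torch.isnan(out).any().item())   # True -> and they propagate through the LSTM
```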

Please correct me if I'm wrong ... Thanks a lot in advance...

Note: I have used the same versions of the packages listed in the requirements.txt file.

EmmaRocheteau commented 2 years ago

Hello! Thanks for the detailed comment. I’ll work through the code and tell you what outputs I should get. I won’t have time this weekend though, so I’ll get back to you after that!

Al-Dailami commented 2 years ago

Thanks for your reply.

One more error; this is the first one I got.

File ".../eICU-GNN-LSTM/src/dataloader/convert.py", line 75, in convert_into_mmap write_file[n : n+arr_len, :] = arr # write into mmap ValueError: could not broadcast input array from shape (62385,92) into shape (62385,57)

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/src/dataloader/convert.py#L55

These numbers are different from the ones I got after running the preprocessing code: diagnoses = 356, labels = 5, and flat = 93.

Fixed by:

    df = pd.read_csv(csv_path)
    n_cols = (df.shape[1] - 1) if n_cols is None else n_cols

https://github.com/EmmaRocheteau/eICU-GNN-LSTM/blob/5167eea88bfe7a3146ccb6194f54e8e57f52128b/src/dataloader/convert.py#L56
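A minimal sketch of the idea behind that fix (the function name and the identifier-column assumption are mine, not the repository's): size the memory-mapped array from the CSV itself, dropping one column for the patient identifier, so the mmap width always matches the preprocessed data.

```python
import numpy as np
import pandas as pd

def open_mmap_for_csv(csv_path, mmap_path, n_rows, n_cols=None):
    """Hypothetical helper: infer the mmap width from the CSV when n_cols is not given."""
    df = pd.read_csv(csv_path)
    n_cols = (df.shape[1] - 1) if n_cols is None else n_cols  # -1 assumes a leading identifier column
    return np.memmap(mmap_path, dtype=np.float32, mode='w+', shape=(n_rows, n_cols))
```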

EmmaRocheteau commented 1 year ago

Thank you for sharing! I will be looking into this soon. Apologies for the incredibly long delay in getting back, and I appreciate you sharing the solution here.