First of all, thank you for your contribution to this project.
I would like to ask which hidden_size was used for each dataset in Table 1 of the paper. From the parameter formula you provided, I can infer that the hidden_size for the MNIST dataset is 8, but I have not been able to work out the hidden_size for the other two datasets. I would appreciate your help.