binqi-sun / egs


The dimension of `self._state["nn_input"]["nod_fea"]` in the `EGSEnv` class #3

Open zark721 opened 6 months ago

zark721 commented 6 months ago

Dear author, I have a question regarding the vector dimension of self._state["nn_input"]["nod_fea"] in the EGSEnv class (in dag_sched_env.py).

My understanding is that self._state["nn_input"]["nod_fea"] represents the observation, which is fed directly into the actor network to compute actions.

With this understanding, I have the following question:

  1. From the code in EGSEnv, the dimension of the node features generated by the environment (self._state["nn_input"]["nod_fea"]) appears to be 4. That makes sense, since each node carries four kinds of information: wcet, eft, lst, and lateral_width. I therefore assume the observation is a two-dimensional matrix of size n×4 (where n is the number of nodes), and that after the actor's encoder layers the output should also be n×4. However, according to the hyperparameters in the arXiv paper (TABLE IV: PPO hyper-parameters), the node embedding dimension is 64, which conflicts with the dimension of self._state["nn_input"]["nod_fea"]. Did I misunderstand something? Is the observation generated by the environment n×64 or n×4?
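To make my question concrete, here is a minimal numpy sketch of the n×4 node-feature matrix I have in mind (the values and n are made up; the real feature extraction lives in dag_sched_env.py):

```python
import numpy as np

# Hypothetical per-node features: wcet, eft, lst, lateral_width
n_nodes = 5
node_features = np.stack([
    np.random.rand(n_nodes),  # wcet
    np.random.rand(n_nodes),  # eft
    np.random.rand(n_nodes),  # lst
    np.random.rand(n_nodes),  # lateral_width
], axis=1)

# This is the shape I expect for self._state["nn_input"]["nod_fea"]
assert node_features.shape == (n_nodes, 4)
```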

Thank you once again for your support.

binqi-sun commented 6 months ago

Hi, node features are mapped to a node feature embedding through an embedding layer, and the node embedding dimension is 64. I hope this answers your question.
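In other words (a minimal numpy sketch with assumed sizes, not the repository's actual code): the environment emits an n×4 matrix, and a learned linear map lifts it to n×64 before the encoder sees it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, feat_dim, embed_dim = 5, 4, 64

# Raw node features from the environment: shape (n, 4)
nod_fea = rng.random((n_nodes, feat_dim))

# A dense (linear) embedding layer mapping 4 -> 64
W = rng.standard_normal((feat_dim, embed_dim))
b = np.zeros(embed_dim)

node_embedding = nod_fea @ W + b

# The encoder then operates on (n, 64), matching TABLE IV
assert node_embedding.shape == (n_nodes, embed_dim)
```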

zark721 commented 6 months ago

Is the embedding layer you mentioned equivalent to the Encoder Layer here? (see attached screenshot)

zark721 commented 6 months ago

Or is the embedding layer located before the actor and critic encoders, as in the illustration below? If so, is it a Linear layer or an MLP? (see attached screenshot)

binqi-sun commented 6 months ago

Hi, it's an embedding layer before the encoder, as used in many transformer-based architectures. In our case, we use a Dense layer for node feature embeddings and an Embedding layer for mask embeddings.
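A sketch of the distinction, in numpy (the mask shape and vocabulary size are my assumptions for illustration): a dense layer multiplies continuous features by a weight matrix, while an embedding layer is a lookup table indexed by discrete values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, feat_dim, embed_dim = 5, 4, 64

# Dense layer: continuous node features (n, 4) -> (n, 64)
nod_fea = rng.random((n_nodes, feat_dim))
dense_W = rng.standard_normal((feat_dim, embed_dim))
node_emb = nod_fea @ dense_W

# Embedding layer: a lookup table for discrete mask values.
# Assuming a binary mask over node pairs, i.e. values in {0, 1}.
mask = rng.integers(0, 2, size=(n_nodes, n_nodes))
embed_table = rng.standard_normal((2, embed_dim))
mask_emb = embed_table[mask]  # index lookup, shape (n, n, 64)

assert node_emb.shape == (n_nodes, embed_dim)
assert mask_emb.shape == (n_nodes, n_nodes, embed_dim)
```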

zark721 commented 6 months ago

Thanks again, but I'm not sure about the following details.

  1. The Actor and Critic each have their own Dense layer as the node embedding layer, right? Should the Dense layer be followed by an activation function?

  2. The mask embedding you mentioned is for the spatial encoding in Multi-Head Attention, is that correct?

    (see attached screenshot)
  3. What is the purpose of self._state["nn_input"]["pad_msk"]?

  4. In class EGSEnv, there are operations like padding(self._dag.trans_closure, self.max_n_nodes). However, the value of self.max_n_nodes is the same as the dimension of self._dag.trans_closure, so the padding here has no effect. Could you explain the purpose of the padding operation and of the variable self.max_n_nodes? They don't seem to play any role here.
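To illustrate what I mean in question 4, here is a hypothetical padding helper (my own sketch, not the repository's implementation): when max_n_nodes equals the matrix's current size, it returns the input unchanged, which is why I wonder what the variable is for.

```python
import numpy as np

def padding(mat, max_n):
    """Hypothetical helper: zero-pad a square matrix up to (max_n, max_n)."""
    n = mat.shape[0]
    out = np.zeros((max_n, max_n), dtype=mat.dtype)
    out[:n, :n] = mat
    return out

trans_closure = np.ones((3, 3), dtype=int)

# When max_n_nodes equals the current size, padding is a no-op:
assert np.array_equal(padding(trans_closure, 3), trans_closure)

# It would only matter for a DAG smaller than max_n_nodes, e.g. to
# keep a fixed observation shape across episodes:
padded = padding(trans_closure, 5)
assert padded.shape == (5, 5)
```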

I am a student studying machine learning, and I have read the Transformer and Graphormer papers along with some related code. If you think I'm missing other basic knowledge, please just point me to the relevant links or keywords to save your time. Thank you very much.