AlexBrians opened this issue 10 months ago
In the model training process, I noticed that raw rewards are used directly instead of returns-to-go (line 123).
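For reference, here is a minimal sketch of how returns-to-go are typically computed from per-step rewards (the function name and the undiscounted default are my own; the repo may apply a discount factor):

```python
import numpy as np

def compute_returns_to_go(rewards, gamma=1.0):
    # Return-to-go at step t is the (discounted) sum of rewards from t onward:
    # rtg[t] = rewards[t] + gamma * rtg[t + 1]
    rtg = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

# e.g. rewards [1.0, 2.0, 3.0] with gamma=1.0 -> returns-to-go [6.0, 5.0, 3.0]
```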
Additionally, there appears to be an inconsistency in the timesteps (line 122): they should index each individual trajectory from zero rather than run collectively across a whole batch.
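A sketch of what I mean by per-trajectory timesteps (shapes are hypothetical):

```python
import torch

batch_size, T = 4, 20

# Each trajectory should carry its own timestep index 0..T-1:
timesteps = torch.arange(T).unsqueeze(0).repeat(batch_size, 1)  # shape (B, T), every row is 0..T-1

# not a single running index over the whole batch, e.g.:
# torch.arange(batch_size * T).view(batch_size, T)
```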
In ./models/general_model_transforlight.py and ./models/general_model_DT.py,
the actions and returns-to-go are initialized as zero matrices (lines 70 and 71), which is not valid input for a DT.
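For comparison, a minimal sketch of how the original DT (Chen et al., 2021) initializes its rollout inputs, with the variable names and target return being my own placeholders: only the action slot for the step currently being predicted is a zero placeholder, while the returns-to-go channel is seeded with a target return and decremented by observed rewards, not left at zero.

```python
import torch

# Hypothetical dimensions and target return -- not taken from this repo.
state_dim, act_dim, target_return = 16, 8, 3600.0

state = torch.randn(1, 1, state_dim)                   # stand-in for the first observation
actions = torch.zeros(1, 1, act_dim)                   # zero placeholder ONLY for the action to predict
returns_to_go = torch.full((1, 1, 1), target_return)   # conditioned on a target return, not zeros
timesteps = torch.zeros(1, 1, dtype=torch.long)

# After each env step, append: the next state, a fresh zero action slot,
# returns_to_go[0, -1] - reward as the next return-to-go, and timestep t + 1.
```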