BlueAndi / RadonUlzer

Line follower, platooning, sensor fusion with DroidControlSystem, and more.
MIT License

Reinforcement Learning Architecture #156

Open gabryelreyes opened 1 month ago

gabryelreyes commented 1 month ago

Optimize the architecture, with a special focus on separating the agent from the environment.
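
A minimal sketch of what that separation could look like (the class and method names here are hypothetical, not taken from this repository): the agent only talks to a small environment interface, so the policy and training code stay independent of how observations are produced.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
import numpy as np


@dataclass
class StepResult:
    observation: np.ndarray
    reward: float
    done: bool


class Environment(ABC):
    """The only boundary the agent is allowed to talk to."""

    @abstractmethod
    def reset(self) -> np.ndarray: ...

    @abstractmethod
    def step(self, action: np.ndarray) -> StepResult: ...


class Agent:
    """Owns the policy; knows nothing about how observations are produced."""

    def predict_action(self, observation: np.ndarray) -> np.ndarray:
        # Placeholder policy; a real agent would query its network here.
        return np.zeros(2, dtype=np.float32)

    def train_episode(self, env: Environment, max_steps: int = 1000) -> float:
        observation = env.reset()
        total_reward = 0.0
        for _ in range(max_steps):
            action = self.predict_action(observation)
            result = env.step(action)
            total_reward += result.reward
            observation = result.observation
            if result.done:
                break
        return total_reward
```

With this split, the simulation (or the real robot) can be swapped out behind the `Environment` interface without touching the agent's training code.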

hoeftjch commented 1 month ago

Suggestion for general architecture improvements:

Possible performance improvements:

```python
import tensorflow as tf


def create_dataset_from_buffer(buffer):
    # Convert the replay buffer fields to tensors once, instead of on every
    # training step.
    states = tf.convert_to_tensor(buffer.states, dtype=tf.float32)
    actions = tf.convert_to_tensor(buffer.actions, dtype=tf.float32)
    rewards = tf.convert_to_tensor(buffer.rewards, dtype=tf.float32)
    next_states = tf.convert_to_tensor(buffer.next_states, dtype=tf.float32)
    dones = tf.convert_to_tensor(buffer.dones, dtype=tf.float32)
    advantages = tf.convert_to_tensor(buffer.advantages, dtype=tf.float32)

    dataset = create_dataset(states, actions, rewards, next_states, dones, advantages)
    return dataset
```
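
`create_dataset` itself is not shown in the comment; a minimal sketch of how it could be implemented with `tf.data` (the shuffling and batch size are assumptions, not part of the original suggestion):

```python
import tensorflow as tf


def create_dataset(states, actions, rewards, next_states, dones, advantages,
                   batch_size=64):
    # Build a single tf.data pipeline from the already-converted tensors, so
    # the training loop iterates over shuffled mini-batches without copying
    # the buffer again each epoch.
    dataset = tf.data.Dataset.from_tensor_slices(
        (states, actions, rewards, next_states, dones, advantages))
    return dataset.shuffle(buffer_size=states.shape[0]).batch(batch_size)
```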

- apply the `@tf.function` decorator to functions such as `predict_action` and `learn` (see the sketch below)
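
A minimal sketch of what the decorated functions could look like; only the names `predict_action` and `learn` come from the suggestion, while the model, loss, and optimizer here are placeholders:

```python
import tensorflow as tf


class Agent(tf.Module):
    def __init__(self):
        self.model = tf.keras.Sequential([
            tf.keras.layers.Dense(32, activation="relu"),
            tf.keras.layers.Dense(2),
        ])
        self.optimizer = tf.keras.optimizers.Adam(1e-3)

    @tf.function  # traced once per input signature, then executed as a graph
    def predict_action(self, states):
        return self.model(states)

    @tf.function  # keeps the whole gradient step inside one graph call
    def learn(self, states, targets):
        with tf.GradientTape() as tape:
            predictions = self.model(states)
            loss = tf.reduce_mean(tf.square(targets - predictions))
        gradients = tape.gradient(loss, self.model.trainable_variables)
        self.optimizer.apply_gradients(
            zip(gradients, self.model.trainable_variables))
        return loss
```

Decorating these hot paths avoids re-running them eagerly op by op, which is usually where most of the Python overhead in a training loop comes from.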