Emmanuel-Naive / MATD3

Use Multi-agent Twin Delayed Deep Deterministic Policy Gradient(TD3) algorithm to find reasonable paths for ships
43 stars 4 forks source link