Open lucasosouza opened 5 years ago
Also implement importance sampling so that it is useful in off-policy methods without making it online, which would not work well with an experience buffer, where experiences can be really old.
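
A rough sketch of one way this could look, not a proposal for the final API: store the behavior policy's log-probability with each transition in the buffer, then compute a truncated importance ratio against the current policy at sampling time. The function name, the `clip` parameter, and the example probabilities below are all hypothetical, and truncating the ratio is an assumption made here to limit variance when experiences are very old.

```python
import numpy as np


def truncated_importance_weights(target_logp, behavior_logp, clip=1.0):
    """Per-sample ratios pi(a|s) / mu(a|s), truncated at `clip`.

    target_logp:   log-probs of the stored actions under the current policy.
    behavior_logp: log-probs recorded when the transitions were collected.
    Both names are hypothetical; the buffer is assumed to store behavior_logp.
    """
    target_logp = np.asarray(target_logp, dtype=np.float64)
    behavior_logp = np.asarray(behavior_logp, dtype=np.float64)
    # Work in log space and exponentiate once to avoid dividing tiny numbers.
    ratios = np.exp(target_logp - behavior_logp)
    # Truncate from above so stale samples cannot dominate the update.
    return np.minimum(ratios, clip)


# Made-up batch of three transitions sampled from the buffer.
behavior_logp = np.log([0.5, 0.25, 0.1])
target_logp = np.log([0.25, 0.5, 0.1])
weights = truncated_importance_weights(target_logp, behavior_logp, clip=1.0)
```

The resulting `weights` would multiply the per-sample loss or TD error. Whether truncation, or some other correction, is the right choice for old experiences is left open.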