uzh-rpg / flightmare

An Open Flexible Quadrotor Simulator
https://uzh-rpg.github.io/flightmare/

Forest Rendering Issue #95

Open EceChaik opened 3 years ago

EceChaik commented 3 years ago

Hi guys, I am just getting started with the flightmare package and am testing the pre-trained RL examples. I went through the wiki and downloaded the latest release of the Unity rendering binary, but the forest scene does not render correctly. The example still runs and the drone hovers as expected, and all the other scenes display fine; however, this is what the forest environment looks like:

Screenshot from 2021-03-22 13-32-55

Is this a known issue?

yun-long commented 3 years ago

This rendering is correct. I guess the current drone position is very close to (x, y, z) = (0, 0, 5). If you set the drone position to (x, y, z) = (101, 86, 31), you will get a significantly different view.

yun-long commented 3 years ago

To do so, modify `flightlib/src/envs/quadrotor_env/quadrotor_env.cpp`.

At the moment, the goal state is (0, 0, 5); change it to (101, 86, 31). Also adapt the world box, e.g. `world_box_ << -200, 200, -200, 200, -200, 200;`.
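Concretely, the two edits in the `QuadrotorEnv` constructor would look roughly like this (a sketch only, with the unrelated parts of the constructor elided):

```cpp
// flightlib/src/envs/quadrotor_env/quadrotor_env.cpp (constructor, abridged)
QuadrotorEnv::QuadrotorEnv(const std::string &cfg_path)
  : EnvBase(),
    // ... reward coefficients as before ...
    // goal position moved from (0, 0, 5) to (101, 86, 31); the orientation
    // and velocity targets stay zero
    goal_state_((Vector<quadenv::kNObs>() << 101.0, 86.0, 31.0, 0.0, 0.0, 0.0,
                 0.0, 0.0, 0.0, 0.0, 0.0, 0.0).finished()) {
  // ...
  // enlarge the bounding box so that the new goal lies inside it
  world_box_ << -200, 200, -200, 200, -200, 200;
  if (!quadrotor_ptr_->setWorldBox(world_box_)) {
    logger_.error("cannot set world box");
  }
  // ...
}
```

The exact ±200 m bounds are just an example; the world box only has to be large enough to contain both the reset region and the new goal.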

And change the random reset in `QuadrotorEnv::reset()`:

```cpp
// randomly reset the quadrotor state
// reset position
quad_state_.x(QS::POSX) = uniform_dist_(random_gen_) + 101;
quad_state_.x(QS::POSY) = uniform_dist_(random_gen_) + 86;
quad_state_.x(QS::POSZ) = uniform_dist_(random_gen_) + 5;
```

Then run `pip install .` again and retrain your policy; the drone will now fly to the new goal position.

Note that the RL policy is trained to hover at one specific goal position; every time you change the goal position, you have to retrain it. Nevertheless, I guess it serves as a good starting point for more complex tasks.

EceChaik commented 3 years ago

Thanks, you were right about the position being a bad viewpoint for the forest environment. I made the proposed changes, but after training the drone just keeps falling. Could it be that some RL parameters also need to be adjusted for the new position? I wouldn't normally ask about something like this, because it's clearly not a package issue at this point; I just want to make sure I'm following the correct procedure. This is the final output after running run_drone_control.py with the new goal state:

training

And these are the resulting flights:

results

yun-long commented 3 years ago

No worries. Could you share your code here, i.e. your modified quadrotor_env.cpp?

EceChaik commented 3 years ago

This is my modified quadrotor_env.cpp file:

```cpp
#include "flightlib/envs/quadrotor_env/quadrotor_env.hpp"

namespace flightlib {

QuadrotorEnv::QuadrotorEnv()
  : QuadrotorEnv(getenv("FLIGHTMARE_PATH") +
                 std::string("/flightlib/configs/quadrotor_env.yaml")) {}

QuadrotorEnv::QuadrotorEnv(const std::string &cfg_path)
  : EnvBase(), pos_coeff_(0.0), ori_coeff_(0.0), lin_vel_coeff_(0.0),
    ang_vel_coeff_(0.0), act_coeff_(0.0),
    goal_state_((Vector<quadenv::kNObs>() << 101.0, 86.0, 31.0, 0.0, 0.0, 0.0,
                 0.0, 0.0, 0.0, 0.0, 0.0, 0.0).finished()) {
  // load configuration file
  YAML::Node cfg_ = YAML::LoadFile(cfg_path);

  quadrotor_ptr_ = std::make_shared<Quadrotor>();
  // update dynamics
  QuadrotorDynamics dynamics;
  dynamics.updateParams(cfg_);
  quadrotor_ptr_->updateDynamics(dynamics);

  // define a bounding box
  world_box_ << -200, 200, -200, 200, -200, 200;
  if (!quadrotor_ptr_->setWorldBox(world_box_)) {
    logger_.error("cannot set world box");
  }

  // define input and output dimension for the environment
  obs_dim_ = quadenv::kNObs;
  act_dim_ = quadenv::kNAct;

  Scalar mass = quadrotor_ptr_->getMass();
  act_mean_ = Vector<quadenv::kNAct>::Ones() * (-mass * Gz) / 4;
  act_std_ = Vector<quadenv::kNAct>::Ones() * (-mass * 2 * Gz) / 4;

  // load parameters
  loadParam(cfg_);
}

QuadrotorEnv::~QuadrotorEnv() {}

bool QuadrotorEnv::reset(Ref<Vector<>> obs, const bool random) {
  quad_state_.setZero();
  quad_act_.setZero();

  if (random) {
    // randomly reset the quadrotor state
    // reset position
    quad_state_.x(QS::POSX) = uniform_dist_(random_gen_) + 101;
    quad_state_.x(QS::POSY) = uniform_dist_(random_gen_) + 86;
    quad_state_.x(QS::POSZ) = uniform_dist_(random_gen_) + 31;
    if (quad_state_.x(QS::POSZ) < -0.0)
      quad_state_.x(QS::POSZ) = -quad_state_.x(QS::POSZ);
    // reset linear velocity
    quad_state_.x(QS::VELX) = uniform_dist_(random_gen_);
    quad_state_.x(QS::VELY) = uniform_dist_(random_gen_);
    quad_state_.x(QS::VELZ) = uniform_dist_(random_gen_);
    // reset orientation
    quad_state_.x(QS::ATTW) = uniform_dist_(random_gen_);
    quad_state_.x(QS::ATTX) = uniform_dist_(random_gen_);
    quad_state_.x(QS::ATTY) = uniform_dist_(random_gen_);
    quad_state_.x(QS::ATTZ) = uniform_dist_(random_gen_);
    quad_state_.qx /= quad_state_.qx.norm();
  }
  // reset quadrotor with random states
  quadrotor_ptr_->reset(quad_state_);

  // reset control command
  cmd_.t = 0.0;
  cmd_.thrusts.setZero();

  // obtain observations
  getObs(obs);
  return true;
}

bool QuadrotorEnv::getObs(Ref<Vector<>> obs) {
  quadrotor_ptr_->getState(&quad_state_);

  // convert quaternion to euler angle
  Vector<3> euler_zyx = quad_state_.q().toRotationMatrix().eulerAngles(2, 1, 0);
  // quaternionToEuler(quad_state_.q(), euler);
  quad_obs_ << quad_state_.p, euler_zyx, quad_state_.v, quad_state_.w;

  obs.segment<quadenv::kNObs>(quadenv::kObs) = quad_obs_;
  return true;
}

Scalar QuadrotorEnv::step(const Ref<Vector<>> act, Ref<Vector<>> obs) {
  quad_act_ = act.cwiseProduct(act_std_) + act_mean_;
  cmd_.t += sim_dt_;
  cmd_.thrusts = quad_act_;

  // simulate quadrotor
  quadrotor_ptr_->run(cmd_, sim_dt_);

  // update observations
  getObs(obs);

  Matrix<3, 3> rot = quad_state_.q().toRotationMatrix();

  // ---------------------- reward function design
  // - position tracking
  Scalar pos_reward = pos_coeff_ *
    (quad_obs_.segment<quadenv::kNPos>(quadenv::kPos) -
     goal_state_.segment<quadenv::kNPos>(quadenv::kPos)).squaredNorm();
  // - orientation tracking
  Scalar ori_reward = ori_coeff_ *
    (quad_obs_.segment<quadenv::kNOri>(quadenv::kOri) -
     goal_state_.segment<quadenv::kNOri>(quadenv::kOri)).squaredNorm();
  // - linear velocity tracking
  Scalar lin_vel_reward = lin_vel_coeff_ *
    (quad_obs_.segment<quadenv::kNLinVel>(quadenv::kLinVel) -
     goal_state_.segment<quadenv::kNLinVel>(quadenv::kLinVel)).squaredNorm();
  // - angular velocity tracking
  Scalar ang_vel_reward = ang_vel_coeff_ *
    (quad_obs_.segment<quadenv::kNAngVel>(quadenv::kAngVel) -
     goal_state_.segment<quadenv::kNAngVel>(quadenv::kAngVel)).squaredNorm();

  // - control action penalty
  Scalar act_reward = act_coeff_ * act.cast<Scalar>().norm();

  Scalar total_reward =
    pos_reward + ori_reward + lin_vel_reward + ang_vel_reward + act_reward;

  // survival reward
  total_reward += 0.1;

  return total_reward;
}

bool QuadrotorEnv::isTerminalState(Scalar &reward) {
  if (quad_state_.x(QS::POSZ) <= 0.02) {
    reward = -0.02;
    return true;
  }
  reward = 0.0;
  return false;
}

bool QuadrotorEnv::loadParam(const YAML::Node &cfg) {
  if (cfg["quadrotor_env"]) {
    sim_dt_ = cfg["quadrotor_env"]["sim_dt"].as<Scalar>();
    max_t_ = cfg["quadrotor_env"]["max_t"].as<Scalar>();
  } else {
    return false;
  }

  if (cfg["rl"]) {
    // load reinforcement learning related parameters
    pos_coeff_ = cfg["rl"]["pos_coeff"].as<Scalar>();
    ori_coeff_ = cfg["rl"]["ori_coeff"].as<Scalar>();
    lin_vel_coeff_ = cfg["rl"]["lin_vel_coeff"].as<Scalar>();
    ang_vel_coeff_ = cfg["rl"]["ang_vel_coeff"].as<Scalar>();
    act_coeff_ = cfg["rl"]["act_coeff"].as<Scalar>();
  } else {
    return false;
  }
  return true;
}

bool QuadrotorEnv::getAct(Ref<Vector<>> act) const {
  if (cmd_.t >= 0.0 && quad_act_.allFinite()) {
    act = quad_act_;
    return true;
  }
  return false;
}

bool QuadrotorEnv::getAct(Command *const cmd) const {
  if (!cmd_.valid()) return false;
  *cmd = cmd_;
  return true;
}

void QuadrotorEnv::addObjectsToUnity(std::shared_ptr<UnityBridge> bridge) {
  bridge->addQuadrotor(quadrotor_ptr_);
}

std::ostream &operator<<(std::ostream &os, const QuadrotorEnv &quad_env) {
  os.precision(3);
  os << "Quadrotor Environment:\n"
     << "obs dim =  [" << quad_env.obs_dim_ << "]\n"
     << "act dim =  [" << quad_env.act_dim_ << "]\n"
     << "sim dt =   [" << quad_env.sim_dt_ << "]\n"
     << "max_t =    [" << quad_env.max_t_ << "]\n"
     << "act_mean = [" << quad_env.act_mean_.transpose() << "]\n"
     << "act_std =  [" << quad_env.act_std_.transpose() << "]\n"
     << "obs_mean = [" << quad_env.obs_mean_.transpose() << "]\n"
     << "obs_std =  [" << quad_env.obs_std_.transpose() << std::endl;
  os.precision();
  return os;
}

}  // namespace flightlib
```

EceChaik commented 3 years ago

Have you perhaps spotted any mistakes?