Closed bossdm closed 4 years ago
See if this thread helps: https://www.argos-sim.info/forum/viewtopic.php?t=178 It seems we have to split the arena into multiple parts, and keep track of robots and objects in each part (see example https://github.com/ilpincy/argos3-examples/blob/master/experiments/multiple_engines.argos).
The thread also has some tweaks to speed up the argos simulation.
I also came across the MPGA example in https://www.argos-sim.info/examples.php (see last example). It seems interesting, but would take some work to redesign the argos-sferes interface.
Kind regards, Danesh.
On Thu, Jun 27, 2019 at 3:39 AM bossdm notifications@github.com wrote:
I have run code with multiple physics engines now successfully and this greatly speeds up the code.
However, the manual setting of the robots is causing some issue:
./bin/behaviour_evolcvt10D experiments/Gomes_walls_and_robots_std.argos
[INFO] Using 4 parallel threads
[INFO] Chosen method "balance_quantity": threads will be assigned the same
[INFO] number of tasks, independently of the task length.
[INFO] Using random seed = 1
[INFO] Using simulation clock tick = 0.1
[INFO] Total experiment length in clock ticks = 1200
[INFO] Loaded library "./lib/libnn_controller.so"
[INFO] Loaded library "./lib/libevolution_loopfunctionscvt10D.so"
[INFO] The physics engine "dyn2d_0" will perform 10 iterations per tick (dt = 0.01 sec)
[INFO] The physics engine "dyn2d_1" will perform 10 iterations per tick (dt = 0.01 sec)
[INFO] The physics engine "dyn2d_2" will perform 10 iterations per tick (dt = 0.01 sec)
[INFO] The physics engine "dyn2d_3" will perform 10 iterations per tick (dt = 0.01 sec)
[FATAL] Dynamics2D model id "thymio1" not found in dynamics 2D engine "dyn2d_0"
The error can be avoided by commenting out line 292 in base_loop_functions.cpp, which calls reset_agent_positions(). However, you would then need some other means of distributing the agents per trial. I have noticed that 'distribute' in the argos config does nothing on random resets, even with random seed 0, if you look at the positions of the robots or any cylinders. According to the code at https://github.com/ilpincy/argos3/blob/f853c1a324b96ad5539e6de7f325a381c9bc45cb/src/core/simulator/space/space.h, it seems that distribute is only used at Init.
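One possible way to get a fresh layout per trial without relying on 'distribute' is to draw the positions from a seeded generator. The sketch below is only an illustration in plain C++: Pose2D and sample_trial_poses are hypothetical names, not ARGoS or argos-sferes code, and the square arena and the minimum distance are assumptions. It uses simple rejection sampling so that robots do not overlap.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a 2D pose; this is not an ARGoS type.
struct Pose2D {
    double x;
    double y;
    double heading;  // radians
};

// Draw up to n non-overlapping poses inside the square [-half_side, half_side)^2
// by rejection sampling. The same (seed, trial) pair always gives the same layout
// on a given standard library, so each trial is reproducible.
std::vector<Pose2D> sample_trial_poses(std::uint32_t seed, std::uint32_t trial,
                                       std::size_t n, double half_side,
                                       double min_dist, int max_tries = 10000) {
    assert(half_side > 0.0 && min_dist >= 0.0);
    const double pi = 3.14159265358979323846;
    std::seed_seq seq{seed, trial};
    std::mt19937 rng(seq);
    std::uniform_real_distribution<double> coord(-half_side, half_side);
    std::uniform_real_distribution<double> angle(-pi, pi);

    std::vector<Pose2D> poses;
    for (int tries = 0; poses.size() < n && tries < max_tries; ++tries) {
        Pose2D cand{coord(rng), coord(rng), angle(rng)};
        bool free_spot = true;
        for (const Pose2D& p : poses) {
            if (std::hypot(cand.x - p.x, cand.y - p.y) < min_dist) {
                free_spot = false;  // too close to an already placed robot
                break;
            }
        }
        if (free_spot) poses.push_back(cand);
    }
    return poses;  // may hold fewer than n poses if the arena is too crowded
}
```

A call such as sample_trial_poses(1, trial, 10, 2.0, 0.2) would give one layout per trial index. The poses would still have to be passed to the existing placement code, so the cross-engine problem reported here is not solved by this alone.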
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/resilient-swarms/argos-sferes/issues/14?email_source=notifications&email_token=ACAO4KINWF22GXEW4HS6YODP4QR7JA5CNFSM4H3XRRX2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4G36PR6Q, or mute the thread https://github.com/notifications/unsubscribe-auth/ACAO4KOVJZBVJOJFPIFYWK3P4QR7JANCNFSM4H3XRRXQ .
Thanks,
I had used this forum thread; my configuration is based on that example. I also am using the Os flag as recommended there.
"keep track of robots and objects in each part": setting up the configuration does not appear to be enough. And looking at footbot_diffusion.cpp, it seems nothing special is needed beyond the configuration. I think that manually placing a robot can move it out of the region of the physics engine it belongs to and into a region that belongs to another engine. I will have a look at some argos3 code.
I will also have a close look at the MPGA code and example.
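To make the suspected cause concrete, here is a small self-contained sketch of how a reset position can land in a region owned by a different engine. Everything in it is an assumption for illustration: EngineRegion, make_quadrants, engine_for and reset_crosses_engines are hypothetical names, and the 4 m x 4 m arena split into four quadrants is not taken from the actual .argos file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical axis-aligned region owned by one physics engine.
struct EngineRegion {
    std::string engine_id;
    double min_x, max_x, min_y, max_y;
};

// Assumed layout: a 4 m x 4 m arena centred on the origin, split into quadrants.
std::vector<EngineRegion> make_quadrants() {
    return {
        {"dyn2d_0", -2.0, 0.0, -2.0, 0.0},
        {"dyn2d_1",  0.0, 2.0, -2.0, 0.0},
        {"dyn2d_2", -2.0, 0.0,  0.0, 2.0},
        {"dyn2d_3",  0.0, 2.0,  0.0, 2.0},
    };
}

// Return the id of the engine whose region contains (x, y), or "" if none does.
std::string engine_for(const std::vector<EngineRegion>& regions, double x, double y) {
    for (const EngineRegion& r : regions) {
        assert(r.min_x < r.max_x && r.min_y < r.max_y);
        if (x >= r.min_x && x < r.max_x && y >= r.min_y && y < r.max_y) {
            return r.engine_id;
        }
    }
    return "";
}

// True when moving a robot from (x0, y0) to (x1, y1) changes the owning engine.
bool reset_crosses_engines(const std::vector<EngineRegion>& regions,
                           double x0, double y0, double x1, double y1) {
    return engine_for(regions, x0, y0) != engine_for(regions, x1, y1);
}
```

With this layout, a robot that starts at (-1, -1) and is reset to (1, -1) moves from the region of "dyn2d_0" to that of "dyn2d_1", which is the kind of move that could leave its model registered with the wrong engine.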
On Thu, Jun 27, 2019 at 11:04 AM bossdm notifications@github.com wrote:
Thanks,
I had used this forum thread; my configuration is based on that example. I also am using the Os flag as recommended there.
They also had optimization recommendations for the RAB sensors and LEDs too. Please take a look at it.
"keep track of robots and objects in each part": setting up the configuration does not appear to be enough. And looking at footbot_diffusion.cpp, it seems nothing special is needed beyond the configuration. I think that manually placing a robot can move it out of the region of the physics engine it belongs to and into a region that belongs to another engine. I will have a look at some argos3 code.
Okay.
I will also have a close look at the MPGA code and example.
If we can see how the MPGA example works, we could integrate it with the eval::Parallel functionality of sferes to run parallel argos simulation threads. What do you think?
If we can see how the MPGA example works, we could integrate it with the eval::Parallel functionality of sferes to run parallel argos simulation threads. What do you think?
For now all I can say is that it sounds like a good idea and is likely a tad more efficient than running one trial at a time and parallelising within the trial, but it would likely take some work. I will have a look and see how difficult it is. I think it is probably also less risky in terms of bugs, as the multiple physics engines may affect some of the code we wrote if we don't fully understand how they work.
For the rab sensors I have added the line
For the LED what would the range be ? Even though we are not using it in the experiments
On Thu, Jun 27, 2019 at 11:38 AM bossdm notifications@github.com wrote:
For the rab sensors I have added the line
Okay. But please make sure the chosen grid size makes sense for us.
For the LED what would the range be ? Even though we are not using it in the experiments
I think we should go ahead with your suggestion and remove the LED sensors. What do you think?
I think we should go ahead with your suggestion and remove the LED sensors. What do you think?
For the baseline-behaviours I can keep the code as is (so there are no issues running them):
m_pcWheels = GetActuator<CCI_DifferentialSteeringActuator>("differential_steering");
m_pcWheelsEncoder = GetSensor<CCI_DifferentialSteeringSensor>("differential_steering");
m_pcLeds = GetActuator<CCI_ThymioLedsActuator>("thymio_led");
m_pcProximity = GetSensor<CCI_ThymioProximitySensor>("Thymio_proximity");
m_pcGround = GetSensor<CCI_ThymioGroundSensor>("Thymio_ground");
m_pcRABA = GetActuator<CCI_RangeAndBearingActuator>("range_and_bearing");
m_pcRABS = GetSensor<CCI_RangeAndBearingSensor>("range_and_bearing");
whereas for the NN we can get only the ones being used:
m_pcWheels = GetActuator<CCI_DifferentialSteeringActuator>("differential_steering");
m_pcProximity = GetSensor<CCI_ThymioProximitySensor>("Thymio_proximity");
m_pcRABA = GetActuator<CCI_RangeAndBearingActuator>("range_and_bearing");
m_pcRABS = GetSensor<CCI_RangeAndBearingSensor>("range_and_bearing");
and remove the other ones from the configuration as well.
I have taken a look at the MPGA code; it appears to use one process per individual. That seems to be the same as in sferes, where the wiki says
// The evaluator is in charge of distributing the evaluation of the
// population.
I will try to mimic that setup. I remember that there used to be some issue with eval::Parallel; is that still the case?
On Thu, Jun 27, 2019 at 2:48 PM bossdm notifications@github.com wrote:
I will try to mimic that setup. I remember that there used to be some issue with eval::Parallel, is that still the case ?
In CMakeLists.txt, change find_package(TBB) #COMPONENTS tbbmalloc tbbmalloc_proxy to find_package(TBB COMPONENTS tbbmalloc tbbmalloc_proxy)
In the main.cpp change eval::Eval to eval::Parallel
and set the number of threads in the .argos file to 4 (the number of cores on my laptop). Also disable visualizations.
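The idea of distributing the evaluation across individuals can be sketched in plain C++ as follows. This is a minimal sketch under stated assumptions, not the sferes eval::Parallel implementation: it uses std::thread instead of TBB, and FakeSimulator and evaluate_population are hypothetical names (FakeSimulator is a toy stand-in, not argos::CSimulator). The point it illustrates is that each worker owns its own simulator object and writes only to its own result slots, so no simulator state is shared between threads.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a simulator instance; NOT argos::CSimulator.
struct FakeSimulator {
    int resets = 0;
    void reset() { ++resets; }
    // Toy fitness that depends only on the genome.
    double run_trial(double genome) const { return genome * genome; }
};

// Evaluate all genomes with n_workers threads. Worker w handles the
// individuals w, w + n_workers, w + 2 * n_workers, ... so that no two
// threads ever write to the same element of the result vector.
std::vector<double> evaluate_population(const std::vector<double>& genomes,
                                        std::size_t n_workers) {
    assert(n_workers > 0);
    std::vector<double> fitness(genomes.size(), 0.0);
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < n_workers; ++w) {
        workers.emplace_back([&genomes, &fitness, w, n_workers]() {
            FakeSimulator sim;  // private to this thread, never shared
            for (std::size_t i = w; i < genomes.size(); i += n_workers) {
                sim.reset();
                fitness[i] = sim.run_trial(genomes[i]);
            }
        });
    }
    for (std::thread& t : workers) t.join();
    return fitness;
}
```

On some toolchains this needs to be linked with the threading library (for example -pthread with GCC). Whether the real simulator can be instantiated once per thread in the same process is a separate question that this sketch does not answer.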
For IRIDIS I found it necessary to write the exact library names. Adding malloc and malloc_proxy helped the runs not to stop immediately. However, most still end up halting eventually with a segmentation fault. I guess this just means we still need to adjust the code to be similar to the multi-process GA.
I don't think using argos threads is necessary, as the parallelisation is across individuals, not within a trial. I tried both with and without threads, but it made no difference.
Below is the output from valgrind --leak-check=full:
==14801== Memcheck, a memory error detector
==14801== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==14801== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==14801== Command: ./bin/behaviour_evolcvt10D experiments/Gomes_walls_and_robots_std.argos
==14801==
[INFO] Not using threads
[INFO] Using random seed = 868053
[INFO] Using simulation clock tick = 0.1
[INFO] Total experiment length in clock ticks = 1200
[INFO] Loaded library "./lib/libnn_controller.so"
[INFO] Loaded library "./lib/libevolution_loopfunctionscvt10D.so"
[INFO] The physics engine "dyn2d" will perform 10 iterations per tick (dt = 0.01 sec)
[INFO] No visualization selected.
Loaded 5000 centroids.
sferes2 version: (const char*)"f731bfb04a48a475dfedbfdb180d77054498557f"
seed: 1561733750
[INFO] Using random seed = 913433
==14801== Invalid read of size 4
==14801== at 0x4E5A576: _set_in (nn.hpp:509)
==14801== by 0x4E5A576: nn::NN<nn::Neuron<nn::PfWSum<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, nn::AfTanh<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, float>, nn::Connection<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself>, float> >::_step(std::vector<float, std::allocator
similar, but with 4 argos threads:
==14856== Memcheck, a memory error detector
==14856== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==14856== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==14856== Command: ./bin/behaviour_evolcvt10D experiments/Gomes_walls_and_robots_std.argos
==14856==
[INFO] Using 4 parallel threads
[INFO] Chosen method "balance_quantity": threads will be assigned the same
[INFO] number of tasks, independently of the task length.
[INFO] Using random seed = 776281
[INFO] Using simulation clock tick = 0.1
[INFO] Total experiment length in clock ticks = 1200
[INFO] Loaded library "./lib/libnn_controller.so"
[INFO] Loaded library "./lib/libevolution_loopfunctionscvt10D.so"
[INFO] The physics engine "dyn2d" will perform 10 iterations per tick (dt = 0.01 sec)
[INFO] No visualization selected.
Loaded 5000 centroids.
sferes2 version: (const char*)"f731bfb04a48a475dfedbfdb180d77054498557f"
seed: 1561733878
[INFO] Using random seed = 69350
==14856== Thread 7:
==14856== Invalid read of size 8
==14856== at 0x57A94AB: argos::CARGoSLog& argos::CARGoSLog::operator<< <char const>(char const) (in /usr/local/lib/argos3/libargos3core_simulator.so)
==14856== by 0x57E5076: argos::CSimulator::Reset() (in /usr/local/lib/argos3/libargos3core_simulator.so)
==14856== by 0x506D3BB: BaseLoopFunctions::start_trial(argos::CSimulator&) (base_loop_functions.cpp:288)
==14856== by 0x4E5504E: EvolutionLoopFunctions::start_trial(argos::CSimulator&) (evol_loop_functions.cpp:322)
==14856== by 0x506D82C: BaseLoopFunctions::perform_trial(argos::CSimulator&) (base_loop_functions.cpp:337)
==14856== by 0x506D887: BaseLoopFunctions::run_all_trials(argos::CSimulator&) (base_loop_functions.cpp:321)
==14856== by 0x1A5611: void sferes::FitObstacleMapElites<Params, stc::Itself>::eval<sferes::phen::Dnn<sferes::gen::Dnn<nn::Neuron<nn::PfWSum<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, nn::AfTanh<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, float>, nn::Connection<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself>, float>, ParamsDnn>, sferes::FitObstacleMapElites<Params, stc::Itself>, ParamsDnn, stc::Itself> >(sferes::phen::Dnn<sferes::gen::Dnn<nn::Neuron<nn::PfWSum<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, nn::AfTanh<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, float>, nn::Connection<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself>, float>, ParamsDnn>, sferes::FitObstacleMapElites<Params, stc::Itself>, ParamsDnn, stc::Itself>&) (evol_loop_functions.h:349)
==14856== by 0x1ACB46: sferes::eval::_parallel_evaluate<sferes::phen::Dnn<sferes::gen::Dnn<nn::Neuron<nn::PfWSum<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, nn::AfTanh<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself> >, float>, nn::Connection<sferes::phen::Parameters<sferes::gen::EvoFloat<1, ParamsDnn, stc::Itself>, sferes::fit::FitDummy<stc::_Params, stc::Itself>, ParamsDnn, stc::Itself>, float>, ParamsDnn>, sferes::FitObstacleMapElites<Params, stc::Itself>, ParamsDnn, stc::Itself> >::operator()(tbb::blocked_range
I was able to apply the UpdateEntityStatus like this:
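CEmbodiedEntity& entity = get_embodied_entity(m_unRobot);
CPhysicsModel* model = nullptr; // stays null if no engine holds the robot
bool moved = entity.MoveTo(
    m_vecInitSetup[m_unCurrentTrial][m_unRobot].Position, // to this position
    m_vecInitSetup[m_unCurrentTrial][m_unRobot].Orientation, // with this orientation
    false // this is not a check, leave the robot there
);
// look for the engine that currently holds the robot's physics model
for (size_t i = 0; i < 4; ++i)
{
    try {
        model = &entity.GetPhysicsModel("dyn2d_" + std::to_string(i));
        std::cout << "Found the entity !" << std::endl;
        break; // stop at the first engine that has the model
    }
    catch (const argos::CARGoSException& e) {
        continue; // not in this engine, try the next one
    }
}
if (model != nullptr) // avoid dereferencing a pointer that was never set
    model->UpdateEntityStatus();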
Unfortunately, the error is still the same type of error:
[FATAL] Dynamics2D model id "thymio0" not found in dynamics 2D engine "dyn2d_1"
I won't have time to look into this tomorrow. Would suggest you try to debug it. If unsuccessful, can try and help you with it on Monday.