tcstewar / 2015-Embodied_Benchmarks

Paper on Embodied Neuromorphic Benchmarks
GNU General Public License v2.0

L80 #19

Open celiasmith opened 8 years ago

celiasmith commented 8 years ago

https://github.com/tcstewar/2015-Embodied_Benchmarks/blob/master/paper/paper.tex#L80

This definition is problematic: "embodied task; that is, a task where the output from the neuromorphic hardware affects its future input."

That covers any task where the hardware is in a loop, so I think it's a bit too general. 'Embodied' strikes me as suggesting that the loop goes specifically through the real physical world. So maybe: "a task where the output from the neuromorphic hardware affects the physical world, which subsequently determines the future input to the hardware"

tcstewar commented 8 years ago

"a task where the output from the neuromorphic hardware affects the physical world, which subsequently determines the future input to the hardware"

Except that it doesn't do that in the simulation case.

I think I'm actually fine with it being any task with a loop. To me, that's the important part, and what makes these tasks interesting and different from all the other benchmarks I've seen.

celiasmith commented 8 years ago

so, the philosopher in me has a really hard time using 'embodied' to describe a case where there is no physical body. all the literature is pretty aligned on this point i think. I was thinking of the simulation as a way to support and make practical the actual embodied part of the benchmarking... not of the simulations as themselves counting as embodied benchmarks.

tcstewar commented 8 years ago

Hmm, the philosopher in me has a hard time with the idea that there should be a distinction between something embodied in the physical world and something embodied in a simulated world.... Do people really say that simulated robots don't count as embodied?

studywolf commented 8 years ago

this was something that caught me too, as i've also only ever heard 'embodied' used for hardware on a robot interacting with the real world.

i was thinking maybe 'dynamic loops' instead of embodied loops...?

celiasmith commented 8 years ago

http://plato.stanford.edu/entries/embodied-cognition/

Sorry... first line: "Cognition is embodied when it is deeply dependent upon features of the physical body of an agent, that is, when aspects of the agent's body beyond the brain play a significant causal or physically constitutive role in cognitive processing."

celiasmith commented 8 years ago

I must admit I'm also struggling a bit with the use of the term 'benchmark' here. I think benchmarks typically consist of tasks that have measures. In the paper I don't think we're advocating that this method will help you generate tasks-with-measures for embodied neuromorphic systems. It seems more like a method to generate and maybe 'pretest' algorithms that will do reasonably well on such benchmarks (we presumably wouldn't want to suggest that the specific measures it generates in simulation will tell us how it does on the embodied benchmarks... i think)

tcstewar commented 8 years ago

Yes, but all the arguments that I can see about that would also apply to someone in a simulated environment. I don't see anywhere where they make a distinction that says "aha! it only counts as embodied if the body is an actual real physical body, not one that exists inside some simulation somewhere".

tcstewar commented 8 years ago

it seems more like a method to generate and maybe 'pretest' algorithms that will do reasonably well on such benchmarks

Ah, I need to be clearer on that then. I don't mean at all for this to be a way of pretesting algorithms for use on some particular benchmarks. If you want an algorithm that is good at controlling some particular robot in some particular task, then you'd better use that particular task as your measure.

I mean this to be a way of generating a benchmark that is useful. The whole point of a benchmark is that it lets me get a sense that a particular system is good overall. If I'm using a benchmark to decide between two different pieces of neuromorphic hardware that I might want to use for some new task I have in mind, I don't really care how well they perform on some other specific task. I care how well they're going to perform on my new task. But that's a totally new task, so no benchmark exists for it yet. If, however, there's a benchmark that covers a huge randomly generated range of tasks, and the benchmark shows that system A is pretty good across that whole space but system B isn't, then I should probably go with system A on my new task.
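(Editor's sketch.) The "huge randomly generated range of tasks" argument above can be made concrete with a toy example. The snippet below is our own illustrative stand-in, not the paper's actual benchmark: each task is a first-order plant with a randomly drawn gain and time constant, the two 'systems' are simple proportional controllers, and the benchmark score is the mean RMSE across the task space. All names and parameter ranges here are our assumptions.

```python
import math
import random

def random_task(seed):
    """One randomly generated task: track a step target through a first-order
    plant whose gain and time constant are drawn at random. A toy stand-in
    for a 'huge randomly generated range of tasks'."""
    rng = random.Random(seed)
    return {"gain": rng.uniform(0.5, 2.0), "tau": rng.uniform(0.1, 0.5)}

def run_task(controller, task, steps=300, dt=0.01):
    """Close the loop (system output -> environment -> system input)
    and return the RMSE between the state and the target."""
    x, target, err_sq = 0.0, 1.0, 0.0
    for _ in range(steps):
        u = controller(target - x)                       # system under test
        x += dt * (task["gain"] * u - x) / task["tau"]   # environment update
        err_sq += (target - x) ** 2
    return math.sqrt(err_sq / steps)

def benchmark(controller, n_tasks=50):
    """Benchmark score = mean RMSE over the whole random task space."""
    return sum(run_task(controller, random_task(s))
               for s in range(n_tasks)) / n_tasks

# Hypothetical systems A and B: A tracks more tightly across the space.
score_a = benchmark(lambda e: 3.0 * e)
score_b = benchmark(lambda e: 0.3 * e)
```

Under these assumptions system A scores better across the whole space, which is the kind of overall comparison the benchmark is meant to support when the eventual target task is unknown.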

tcstewar commented 8 years ago

(Btw, on the "embodied" point, I'm fairly sure I'd also be happy replacing "embodied" with "dynamic", as per @studywolf 's suggestion.)

celiasmith commented 8 years ago

I think embodied is something that would resonate better with the audience (dynamic is kind of generic); but so long as our use of the term doesn't sound implausible...

celiasmith commented 8 years ago

I actually think the burden of proof here isn't on me to find a place where someone says 'simulation doesn't count'... because they never talk about that. They always talk about the 'real world' and 'physics' and the 'real body'. That's why leaving it out is odd. If you want to change the definition... or at least use an unusual one... it would be good to do that explicitly. To do that, i'd suggest keeping the more specific 'through the environment' wording, and then saying that the environment could be simulated and it would still count.

tcstewar commented 8 years ago

I think embodied is something that would resonate better with the audience (dynamic is kind of generic); but so long as our use of the term doesn't sound implausible...

I agree that embodied resonates better with the audience... but I am a bit concerned that it could lead to this same confusion, though. It honestly never crossed my mind that someone wouldn't consider a simulated body to count as "embodied".... I'm clearly too much of a functionalist... :)

celiasmith commented 8 years ago

or a scifi fan :)

celiasmith commented 8 years ago

Sorry, we've got a couple things going on here... on benchmarks: So you want things like the graphs you're generating to be an example of a benchmark. And to say 'to get this benchmark, you need minimal simulation. and it's useful because the exact same benchmark can be run on a real robot with the same kind of results'.

tcstewar commented 8 years ago

I actually think the burden of proof here isn't on me to find a place where someone says 'simulation doesn't count'... because they never talk about that.

Well, according to Andy Clark, "matrix-based human intelligences would count as being as fully and richly embodied as you and I". :) http://www.philosophy.ed.ac.uk/people/clark/pubs/Matrixbody6.pdf

celiasmith commented 8 years ago

And to be specific, the benchmark would be: task: reaching to a location; measure: rmse ... hmm, it's not clear how this comes out of minimal simulation, per se (people have already identified such things)... which is why i was confused; it seems that the focus is on having a way of identifying algorithms (like the evo people did) that will do well on benchmarks both in sim and in the real world.

tcstewar commented 8 years ago

And to be specific, the benchmark would be: task: reaching to a location; measure: rmse ... hmm, it's not clear how this comes out of minimal simulation, per se (people have already identified such things)... which is why i was confused; it seems that the focus is on having a way of identifying algorithms (like the evo people did) that will do well on benchmarks both in sim and in the real world.

I think I'd say more that it's:

task: robustly controlling a bunch of state variables in the presence of unknown dynamics
measure: rmse

The minimal simulation stuff is sort of an argument as to why you would want such a benchmark, because if you do this sort of benchmark, then there's a reasonable chance of it generalizing to new situations.

celiasmith commented 8 years ago

oh andy! now we're in thought experiment territory... nooooooo... in any case, the real issue here is about whether the environment should be mentioned... not whether it's real or simulated (to some impossibly perfect degree)... right? "a task where the output from the neuromorphic hardware affects its future input" needs to mention the environment to be a familiar usage... otherwise we just have a loop.

tcstewar commented 8 years ago

oh andy! now we're in thought experiment territory... nooooooo...

evil grins

in any case, the real issue here is about whether the environment should be mentioned... not whether it's real or simulated (to some impossibly perfect degree)... right? "a task where the output from the neuromorphic hardware affects its future input" needs to mention the environment to be a familiar usage... otherwise we just have a loop.

I completely agree. Being very explicit about there being an environment is needed there. And I think it'll be useful to also be explicit that we mean either simulated or real physical environments.

celiasmith commented 8 years ago

i agree on both counts.

celiasmith commented 8 years ago

benchmarks: Did we present a method that generated that benchmark? (I know i'm being a bit difficult, but it's in the interest of being really clear to the reader what's going on)

tcstewar commented 8 years ago

benchmarks: Did we present a method that generated that benchmark?

Hm. Well, we presented a different way of thinking about benchmarks. And suggested how one might go about defining these sorts of minimal-simulation-based benchmarks. But you're right it's not an explicit methodology...

(I know i'm being a bit difficult, but it's in the interest of being really clear to the reader what's going on)

I'm very very glad you're being difficult on this. It's all been stuck in my head in various different half-baked forms, and this is the first time I've even tried to express it, let alone justify it.... :)

celiasmith commented 8 years ago

So that benchmark seems a bit general also... Specific benchmarks we show are presumably things like:

task: move select state variables (position) to desired state under a delay
measure: rmse

And doing well is low rmse for high delay... (or some range of delays). So I think you're right, the robustness needs to be in there somehow (e.g. the delay part). I'm still struggling with the role of minimal simulation in generating the benchmark. (I agree it's a nice argument about why to have robust benchmarks)
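(Editor's sketch.) The "low rmse for high delay (or some range of delays)" measure can be illustrated with a toy closed loop. This is our own stand-in, not the paper's benchmark: a proportional controller drives a first-order plant, but its feedback is held in a buffer for a fixed number of time steps, and RMSE is measured at each delay. The controller gain, plant, and delay values are all our assumptions.

```python
import math
from collections import deque

def run_with_delay(k, delay_steps, steps=300, dt=0.01):
    """Closed-loop step tracking where the controller only sees the state
    from `delay_steps` time steps ago; returns the RMSE against the target."""
    x, target, err_sq = 0.0, 1.0, 0.0
    # Buffer of past states; buf[0] is the oldest (delayed) observation.
    buf = deque([0.0] * (delay_steps + 1), maxlen=delay_steps + 1)
    for _ in range(steps):
        buf.append(x)                    # newest state enters the buffer
        delayed_x = buf[0]               # controller sees only the old state
        u = k * (target - delayed_x)     # proportional control on stale error
        x += dt * (u - x)                # toy first-order plant update
        err_sq += (target - x) ** 2
    return math.sqrt(err_sq / steps)

# 'Doing well' = keeping RMSE low as the feedback delay grows.
scores = {d: run_with_delay(k=4.0, delay_steps=d) for d in (0, 10, 25, 50)}
```

In this toy setup the RMSE at the largest delay is clearly worse than with no delay, so sweeping the delay (or any other perturbation) gives the robustness axis of the benchmark.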

celiasmith commented 8 years ago

Well, we presented a different way of thinking about benchmarks.

Totally agree. And it's a really nice way of thinking about them that is important for these folks.

celiasmith commented 8 years ago

I should also say I like the notion of a 'hybrid' approach too... which is why i moved it earlier in the abstract (oh, i haven't sent my edits yet... let me do that).

celiasmith commented 8 years ago

Think i'll sleep on it for now :smiley: ... :sleeping: