hansonrobotics / robo_blender

ROS node for blender face-head-neck control

Eye-cam tracking #5

Open Gaboose opened 9 years ago

Gaboose commented 9 years ago

Considering that we're getting more people helping us out on hansonrobotics code, I feel like I should share my plans publicly. Vytas assigned me to make the robot use its in-eye cam when tracking faces. While I failed to commit any actual code for that, I spent my time thinking and writing this doc: https://trello.com/c/Gl9gv7l7/13-face-tracking-using-an-eye-camera

Comments welcome, but I'm afraid the doc might be a little confusing - it was originally meant to be just personal notes. I'm going to start next week if nothing changes.

linas commented 9 years ago

Hi Gabrieliau,

Hard to comment, as I haven't quite figured out the parts. According to github, robo_blender is "A simple model of the HR neck system", but surely that description cannot be right!? The readme then says "This Blender rig uses inputs package to map sensor information into Blender space and outputs package to map the Blender rig onto ROS controllers", but I can't quite figure out what that means. It says that I should "See inputs/init.py, outputs/init.py and modes/init.py for more documentation", but that didn't really help; I'm still confused.

Looking at outputs/init.py, it seems to suggest that the current position and orientation of the Blender rig is reported as an output ROS message -- is that right? This would be so that other things can find out the current head/face/eyes/lips position? It doesn't say anywhere what the outputs actually are, nor how to obtain those outputs. I guess that maybe these are being published over ROS somehow?
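
Just to make the question concrete, here is roughly the kind of publisher I would expect, if the rig state really is published over ROS. This is only a sketch of my guess: the topic name, the PoseStamped message type and the get_rig_state() helper are all made up by me, not anything I found in this repo.

```python
#!/usr/bin/env python
# Hypothetical sketch only: publish the current Blender rig pose as a ROS message.
# The topic name, message type and get_rig_state() helper are guesses, not the actual API.
import rospy
from geometry_msgs.msg import PoseStamped

def get_rig_state():
    """Placeholder for whatever outputs/init.py actually reads off the Blender rig."""
    return {'x': 0.0, 'y': 0.0, 'z': 0.0}

def publish_rig_state():
    rospy.init_node('rig_state_publisher')
    pub = rospy.Publisher('robo_blender/rig_pose', PoseStamped, queue_size=10)
    rate = rospy.Rate(10)  # publish at 10 Hz
    while not rospy.is_shutdown():
        state = get_rig_state()
        msg = PoseStamped()
        msg.header.stamp = rospy.Time.now()
        msg.pose.position.x = state['x']
        msg.pose.position.y = state['y']
        msg.pose.position.z = state['z']
        pub.publish(msg)
        rate.sleep()

if __name__ == '__main__':
    publish_rig_state()
```

If something like this exists, then "rostopic list" and "rostopic echo" on the right topic would answer most of my questions.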

I can't quite figure out what the inputs are supposed to be. It says "camera tracking inputs", but what does that mean? Are we sending raw video to Blender? What does Blender do with raw video? Are the input messages something other than raw video?

I'd like to experiment with it -- I opened the blend files and saw the two different heads, but could not figure out what to do with them...

I also noticed that the last update to the master branch was June 6 2014, but there has been a lot of effort on a branch called "modular-dev" -- that branch has not been folded back into master. Is that because it doesn't work? It seems very unusual to me to do more than one or two days of work without folding it back into master, yet here there is three months' worth of work! I don't know what that means, or how to interpret it ... a shuffling zombie, not alive and not dead?

pi_vision is similar -- several months of work that has not been folded into master.

The project description says: "Pi Vision ported to ROS Indigo." I know what ROS Indigo is, but I don't know what Pi Vision is. What does it do?

Prowling around, it seems to have the capability of tracking faces ... but what does it report: the 2D position and orientation of a face relative to the camera field of view? Or does it report the position of the face in a 3D world map?
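
If it is the former, I would expect downstream consumers to do something like the sketch below. I'm assuming here that pi_vision publishes a sensor_msgs/RegionOfInterest on a /roi-style topic; that assumption needs checking against the actual code, and the topic name may well be different:

```python
#!/usr/bin/env python
# Sketch of consuming a 2D face detection. Assumes (unverified) that pi_vision
# publishes a sensor_msgs/RegionOfInterest on a topic like /roi.
import rospy
from sensor_msgs.msg import RegionOfInterest

def on_face_roi(roi):
    # Centre of the detected face in image coordinates (pixels).
    cx = roi.x_offset + roi.width / 2.0
    cy = roi.y_offset + roi.height / 2.0
    rospy.loginfo("face centre at (%.1f, %.1f), size %dx%d px",
                  cx, cy, roi.width, roi.height)

if __name__ == '__main__':
    rospy.init_node('face_roi_listener')
    rospy.Subscriber('/roi', RegionOfInterest, on_face_roi)
    rospy.spin()
```

Note that a RegionOfInterest only gives a 2D box in the image, not a pose in a 3D world map -- which is exactly the distinction I'm asking about.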

In your Trello card, you keep talking about "eyes" -- but from what I can tell, you are not tracking the gaze, i.e. where the eyes are looking; you are only tracking their location? Two points do give a position and orientation ... but different people have different-sized heads, so you don't know the size of the head. I guess you can make a rough guess of the distance to the head(s). I don't know why 'distance to face' is immediately useful ... just one of those things...
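
For what it's worth, my understanding of that rough-guess idea is just the pinhole camera model: distance is roughly the focal length times the assumed real eye spacing, divided by the eye spacing measured in pixels. A tiny sketch, with made-up numbers for the focal length and the "typical" eye spacing:

```python
# Rough distance-to-face estimate from apparent eye spacing, using a pinhole
# camera model. The focal length and "typical" eye spacing below are
# illustrative guesses, not measured values for the actual eye camera.

FOCAL_LENGTH_PX = 600.0        # camera focal length in pixels (guess)
TYPICAL_EYE_SPACING_M = 0.063  # average adult interpupillary distance, ~63 mm

def estimate_distance(eye_spacing_px):
    """Distance to the face in metres, given eye spacing in image pixels."""
    if eye_spacing_px <= 0:
        raise ValueError("eye spacing must be positive")
    return FOCAL_LENGTH_PX * TYPICAL_EYE_SPACING_M / eye_spacing_px

# Example: eyes appearing 40 px apart -> roughly 600 * 0.063 / 40, about 0.95 m.
print(estimate_distance(40.0))
```

The head-size assumption dominates the error, which is exactly why this can only ever be a rough guess.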

Can it track multiple faces? Can they be tracked through occlusions? If, for example, Eva were presented to an admiring crowd of 10 eager investors/customers, would this be able to track all 10 faces, even as they bob around for a better look? If not 10, then what is the limit? 2? 3? 20? 50?

All of the software that I'm finding seems to be more or less completely undocumented, in that even basic questions, like "what does it do?", are major undertakings. This makes following development impossible for me.

--linas

linas commented 9 years ago

Ugh.

Right after I sent the last message, it occurred to me that I should google "pi vision". I get two hits: one is that it is some sort of GUI for the Raspberry Pi camera. The other is that there is a ROS project with the same name: http://wiki.ros.org/pi_vision

From what I can tell, the Hanson Robotics code is a fork of the original upstream ROS code, with incompatible changes made to it. This raises multiple questions: why are incompatible changes being made? Why are those changes not being sent back upstream? It seems like a bit of a waste of time and effort to fork something, hack on it, and then fail to merge back upstream ... in my experience, upstream usually evolves, and so all proprietary changes become dead-end, throw-away code. I can't tell you how much I hate writing throw-away code. I feel like it's a waste of talent.

So I'm continuing to be ever-more confused ... I'm quite unsure what to do about it.

-- Linas

Gaboose commented 9 years ago

I hate throw-away code too. But that's what we had to do to get quick results for the demos. Maybe it's time to draw some overall graphs of how the nodes interact and what they can be arranged to do. Then it would be easier to figure out how they can be decoupled and made reusable.

I'm also tempted to find you on Freenode and answer all of your questions live.

linas commented 9 years ago

Gabus,

Yes, diagrams would be great -- figuring out what's connected to what has been a bit of a challenge. I'd like to get them published on github somewhere ... perhaps in a new project, let's call it "Software", so that it mirrors

https://github.com/hansonrobotics/Hardware/wiki

I'd like it on a wiki, because that way multiple people can update it on an as-needed basis. It's also integrated into git, so there's revision control ... and anyone can pull the sources. By contrast, neither Trello nor Google Docs can do this, or at least not well? I haven't yet found how to track the revision history of documents in Google Docs. But I'm not too picky right now.

I'm on Freenode as linas, on #opencog and other channels which I mostly just ignore. I can chat, but really the next person after me will have the same questions, so we should find a more permanent solution. A high-level chart showing what connects to what would be a great start.

--linas

linas commented 9 years ago

Gabrieliau, what's your email? I'd like to write to you -- linas

Gaboose commented 9 years ago

Here it is: gabrielius.m@gmail.com. I'm going to sleep now, but I'll reply in about 9 hours.

Gaboose commented 9 years ago

Here are three setups that I'm most familiar with. I put the markdown file in a gist, but we can push it to the "Software" repo once we have it and keep the diagrams updated. And yes, those are markdown sequence diagrams :)