Open nikita-petrashen opened 2 years ago
The ObjectID is generated for each sim object in a scene, and is composed of the object's type and its position in world space. Internally, this value is generated once when a new scene is opened in THOR, and is regenerated when the SetObjectPoses action is run. For example, an Apple in some scene could have a default position of (-02.08, 00.94, -03.62). If you just loaded this scene without rearranging any objects via actions like SetObjectPoses or InitialRandomSpawn, the ObjectID for this apple would be Apple|-02.08|+00.94|-03.62.
If, however, after initially loading this scene, you ran SetObjectPoses to move the apple to some new default location, say (0, 0, 0), then the apple's ObjectID would change to Apple|+00.00|+00.00|+00.00, since the new "default" or starting location of the apple has been set by SetObjectPoses.
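To make the ID layout concrete, here is a minimal Python sketch that splits an ObjectID into its type and starting position, and rebuilds one. The helper names are hypothetical, and the signed, zero-padded, two-decimal number format is inferred only from the two examples above, not taken from the THOR source.

```python
from typing import NamedTuple, Tuple


class ParsedObjectId(NamedTuple):
    object_type: str
    position: Tuple[float, float, float]


def parse_object_id(object_id: str) -> ParsedObjectId:
    """Split 'Type|x|y|z' into the object type and an (x, y, z) tuple."""
    fields = object_id.split("|")
    if len(fields) < 4:
        raise ValueError(f"unexpected ObjectID format: {object_id!r}")
    # Assumption: only the first four fields are interpreted; anything
    # after them is ignored by this sketch.
    x, y, z = (float(value) for value in fields[1:4])
    return ParsedObjectId(fields[0], (x, y, z))


def format_object_id(object_type: str, x: float, y: float, z: float) -> str:
    """Rebuild an ID with signed, zero-padded, two-decimal coordinates."""
    return "|".join([object_type] + [f"{v:+06.2f}" for v in (x, y, z)])
```

Keep in mind that the parsed coordinates are the object's starting position at the last initialization, not necessarily where it is now.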
Note that actions like PickupObject and PutObject do NOT change the ObjectID: even though they can reposition objects, they are meant for manipulation after a scene has been set to its desired default configuration. The SetObjectPoses action is explicitly meant for such initialization, and should be considered an extension of the initialization process on top of setting up the Controller with your desired parameters like field of view and resolution.
Since SetObjectPoses can change the ObjectID of objects, there is another unique identifier in a sim object's metadata, called the name, for tracking which object is being affected in a scene. The name of an object is the type of the object appended with a unique string identifier that never changes, even when reloading the scene or repositioning the object via SetObjectPoses. This means you can use the object name to identify a specific object regardless of your initialization process.
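As a rough illustration, the sketch below looks up an object's current ObjectID by its stable name. It assumes the object metadata is a list of dicts with "name" and "objectId" keys; the helper names and the sample name "Apple_3fef4551" are made up for this example.

```python
from typing import Dict, List


def index_by_name(objects: List[dict]) -> Dict[str, dict]:
    """Map each object's stable name to its metadata entry."""
    return {obj["name"]: obj for obj in objects}


def current_object_id(objects: List[dict], name: str) -> str:
    """Return the ObjectID an object has right now, looked up by name."""
    return index_by_name(objects)[name]["objectId"]


# Hypothetical metadata before and after a SetObjectPoses call.
before = [{"name": "Apple_3fef4551", "objectId": "Apple|-02.08|+00.94|-03.62"}]
after = [{"name": "Apple_3fef4551", "objectId": "Apple|+00.00|+00.00|+00.00"}]

# The name is the same in both snapshots; only the ObjectID differs.
old_id = current_object_id(before, "Apple_3fef4551")
new_id = current_object_id(after, "Apple_3fef4551")
```

Keying on the name means the lookup keeps working however the scene was initialized.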
This information and more is returned as metadata for each interactable object in a scene. The object metadata describes the object's state, position, name, current ObjectID, and more, which you can use to make sure you know exactly which object you are interacting with and how. Fields like pickupable and moveable also indicate the types of actions you can perform on an object. For example, something static like a Countertop could not be picked up and moved by the agent, since it is a structure built into the side of a wall in a kitchen, but an object like a Book is pickupable. All this information is annotated in the object metadata, and further details on the types of interactions, and which actions to call to perform them, are on our documentation site.
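A small sketch of using these flags to select objects, assuming each metadata entry is a dict that may carry boolean "pickupable" and "moveable" keys. The helper name and the sample entries are hypothetical.

```python
from typing import List


def names_with_flag(objects: List[dict], flag: str) -> List[str]:
    """Return the names of objects whose boolean metadata flag is true."""
    return [obj["name"] for obj in objects if obj.get(flag, False)]


# Hypothetical metadata entries mirroring the Countertop/Book example.
sample_objects = [
    {"name": "CounterTop_1", "pickupable": False, "moveable": False},
    {"name": "Book_1", "pickupable": True, "moveable": False},
]

pickupable_names = names_with_flag(sample_objects, "pickupable")
```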
Additionally, there is currently no way to save the Unity scene file after running SetObjectPoses as an action. This would have to be custom functionality added to the Unity Editor itself, as you need to interface with the Editor in order to save the scene as a new version after using something like SetObjectPoses. Currently this isn't exposed via the Python interface, as that only interacts with the build of Unity, not the editor itself. However, including the SetObjectPoses action as part of your initialization step is functionally the same as saving the Unity scene as a new scene and loading that. So unless you need the Unity file itself as its own manipulable asset, you should be able to replicate any scene configuration generated via SetObjectPoses by just running SetObjectPoses itself after initializing the controller.
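Treating SetObjectPoses as part of initialization could look roughly like the sketch below. It assumes each metadata entry has "name", "position", and "rotation" keys, and that SetObjectPoses takes an objectPoses list whose entries use "objectName", "position", and "rotation". The helper names are hypothetical, and the controller is passed in so the functions do not depend on a running THOR build.

```python
from typing import Any, Dict, List


def build_object_poses(objects: List[dict]) -> List[Dict[str, Any]]:
    """Collect poses for objects that can be repositioned."""
    poses = []
    for obj in objects:
        # Assumption: only pickupable or moveable objects are re-posed.
        if not (obj.get("pickupable") or obj.get("moveable")):
            continue
        poses.append({
            "objectName": obj["name"],
            "position": dict(obj["position"]),
            "rotation": dict(obj["rotation"]),
        })
    return poses


def initialize_scene(controller: Any, poses: List[Dict[str, Any]]) -> Any:
    """Run SetObjectPoses right after the controller has been created."""
    return controller.step(action="SetObjectPoses", objectPoses=poses)


# Intended use with a real controller (not executed here):
#   from ai2thor.controller import Controller
#   controller = Controller(scene="FloorPlan1")
#   poses = build_object_poses(controller.last_event.metadata["objects"])
#   ... edit the poses as needed ...
#   initialize_scene(controller, poses)
```

Because the pose list is plain data, the same list can be reused every time the scene is loaded to reproduce the configuration.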
Thanks for the detailed answer Winson!
Regarding the last part of your answer, I need the unity file itself exactly. I've been digging through the C# source code and could not find where the loading of the unity file happens. I'm not really familiar with C#, but my guess is that in order to run SetObjectPoses and save the scene after that as a unity file, I will have to implement a new ServerAction / DynamicServerAction which does the saving, and then send the respective commands to the Python Controller. Is that the right approach?
Thanks!
And the last question: when we call SetObjectPoses and pass the following dict:
Does the position entry correspond to the center of the bounding box or something else?
Regarding the last part of your answer, I need the unity file itself exactly. I've been digging through the C# source code and could not find where does the loading of the unity file happen. I'm not really familiar with C#, but my guess is that in order to run SetObjectPoses and save the scene after that as a unity file I will have to implement a new ServerAction / DynamicServerAction which does the saving and then send the respective commands to the Python Controller. Is that a correct way?
To actually edit and then save the scene file, you will need to work without the Python controller, since this must be done from the Unity editor itself. The Python interface is only meant to interact directly with a build of THOR from Unity, not with the asset files themselves from within the Unity editor.
Since it seems like what you want is to save new .unity scene files after having made changes to them, what you will likely need to do is serialize out the object poses that you would pass to SetObjectPoses, apply them to a scene, and then save those changes as a new scene, all from within the Unity Editor itself. You may need to create an editor function that lets you manipulate a scene as if you were using one of Unity's built-in dialog boxes. This will allow you to make changes to a scene via a script, and then save the scene as a new .unity asset file.
This sort of leads into your other question.
And the last question: when we call SetObjectPoses and pass the following dict: Does position entry correspond to the center of the bounding box or something else?
This position corresponds to the position field of the game object's Transform component. First of all, there is the concept of a GameObject within Unity. Basically, game objects are any objects that exist within a scene: a character, environment objects, lights, sound emitters, etc. Not all game objects have components such as a mesh or a renderer, but all of them have a Transform component.
The Transform component represents a sort of pivot point that a game object can be manipulated from. Wherever this transform is located relative to an object, all position and rotation changes are applied about that point.
The Transform center is not necessarily the center of the bounding box. Some objects have their transform at the average center of the object's mesh, but oftentimes the transform is centered about a different point, so this is not guaranteed.
Take for example these two objects, a statue and an apple. The apple's transform is centered around the average center of the apple object. Moving it moves the entire apple relative to that center, and rotating the apple changes the Rotation field of the Transform by rotating it about the center of the apple. However, the statue's transform is centered closer to the base of the statue rather than its average center. This means that when the rotation in the statue's Transform is modified, the statue rotates about its base, where the transform is, rather than about its center.
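One way to check how far an object's transform sits from its bounding-box center is to compare the two in the object metadata. The sketch below assumes each entry has a "position" dict and an "axisAlignedBoundingBox" dict with a "center" field. Treat those key names as an assumption to verify against your THOR version. The helper name and sample values are invented for illustration.

```python
from typing import Dict


def pivot_offset(obj: dict) -> Dict[str, float]:
    """Return bounding-box center minus transform position, per axis."""
    center = obj["axisAlignedBoundingBox"]["center"]
    position = obj["position"]
    return {axis: center[axis] - position[axis] for axis in ("x", "y", "z")}


# Hypothetical values: the apple pivots at its center, the statue at its base.
apple = {
    "position": {"x": 1.0, "y": 1.0, "z": 1.0},
    "axisAlignedBoundingBox": {"center": {"x": 1.0, "y": 1.0, "z": 1.0}},
}
statue = {
    "position": {"x": 2.0, "y": 0.0, "z": 2.0},
    "axisAlignedBoundingBox": {"center": {"x": 2.0, "y": 0.5, "z": 2.0}},
}

apple_offset = pivot_offset(apple)
statue_offset = pivot_offset(statue)
```

A non-zero offset means a position passed to SetObjectPoses places the pivot, not the visual center, at that point.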
One thing to potentially look into is recreating the logic used in the SetObjectPoses script, but integrating it into an editor function so that it works in "editor-time" rather than at runtime. SetObjectPoses will only execute in-editor if you hit the Play button at the top of the editor, going into runtime mode. However, any changes made to objects during runtime will not be saved in the scene, as you can only save the scene in "editor mode", not in run mode. So you may need to store all the position/rotation changes you want to make to objects in the scene in some serialized form, and then load them in the editor function to make changes that can be saved. You may also be able to make changes in runtime mode and serialize them out, so they can be applied again "for real" in editor mode.
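On the Python side, serializing the pose changes could be as simple as writing them to a JSON file that an editor script later reads. This is only a sketch: the file layout, the "scene" and "objectPoses" keys, and the function names are hypothetical, and the Unity editor function that would consume the file is not shown.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def save_poses(path: Path, scene_name: str, poses: List[Dict[str, Any]]) -> None:
    """Write the scene name and object poses to a JSON file."""
    payload = {"scene": scene_name, "objectPoses": poses}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_poses(path: Path) -> List[Dict[str, Any]]:
    """Read the object poses back from a file written by save_poses."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["objectPoses"]
```

The same file can be passed back to SetObjectPoses at runtime, so the runtime and editor-time paths share one source of truth.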
Thanks a lot, this is very helpful!
Hi!
I'm working with ALFRED right now and need to directly manipulate AI2-THOR scenes (in the form of point clouds) according to the tasks that are given. I was wondering how to interpret the ObjectID string in the ALFRED task descriptions (see pic below), to know exactly which object we interact with. In the issue I've opened, the creators sent me here to ask these questions, because this is an intrinsic THOR API mechanism.
Side question: if we set the state of the scene by sending the SetObjectPoses action, can we save the new state in a .unity file? Maybe a few hints on how to add this functionality if it's not there? Would be great!
Thanks!