Closed rainbow979 closed 7 months ago
All datasets have some language instruction stored, so you can always extract that from the fields "language_instruction" and "language_embedding" (which is a continuous embedding vector from a pre-trained sentence embedder). For the datasets that are listed as "no language" on the spreadsheet, the language instruction is a constant dummy string like "cable routing" or "pick anything" or "do something".
The fields are called "natural_language_instruction" and "natural_language_embedding" respectively, and they live in the "observation" dictionary of each dataset, so you can check them there. For cable routing specifically, I believe the instruction is just "route cable".
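As a rough illustration of the lookup described above, here is a minimal sketch in plain Python. It assumes a step is a nested dictionary and tries both field names mentioned in this thread; the helper name `get_language_fields` and the mock step are hypothetical, and the 4-element embedding is only a placeholder, not the real embedding size. Loading the actual datasets is not shown.

```python
from typing import Any, Mapping, Optional, Sequence, Tuple

# Field names as quoted in this thread. Which variant a given dataset
# uses is an assumption to verify against the dataset itself.
INSTRUCTION_KEYS = ("natural_language_instruction", "language_instruction")
EMBEDDING_KEYS = ("natural_language_embedding", "language_embedding")


def _first_present(d: Mapping[str, Any], keys: Sequence[str]) -> Optional[Any]:
    """Return the value of the first key found in d, or None."""
    for key in keys:
        if key in d:
            return d[key]
    return None


def get_language_fields(step: Mapping[str, Any]) -> Tuple[Optional[str], Optional[Any]]:
    """Hypothetical helper: pull (instruction, embedding) out of one step.

    Looks inside step["observation"] first, then falls back to the
    top level of the step.
    """
    obs = step.get("observation", {})

    instruction = _first_present(obs, INSTRUCTION_KEYS)
    if instruction is None:
        instruction = _first_present(step, INSTRUCTION_KEYS)

    embedding = _first_present(obs, EMBEDDING_KEYS)
    if embedding is None:
        embedding = _first_present(step, EMBEDDING_KEYS)

    # Strings are often stored as bytes; decode them for convenience.
    if isinstance(instruction, bytes):
        instruction = instruction.decode("utf-8")
    return instruction, embedding


# Mock step for illustration only; values are made up.
mock_step = {
    "observation": {
        "natural_language_instruction": b"route cable",
        "natural_language_embedding": [0.0, 0.0, 0.0, 0.0],
    }
}
instruction, embedding = get_language_fields(mock_step)
print(instruction)
```

For a dataset listed as "no language", the same call would simply return the constant dummy string stored for every step.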
On Thu, Nov 2, 2023 at 8:23 AM, Karl Pertsch (via Open X-Embodiment Collaboration) wrote:
All datasets have some language instruction stored, so you can always extract that from the fields "language_instruction" and "language_embedding" (which is a continuous embedding vector from a pre-trained sentence embedder). For the datasets that are listed as "no language" on the spreadsheet, the language instruction is a constant dummy string like "cable routing" or "pick anything" or "do something".
I noticed that some datasets are marked as having no language description in the spreadsheet. How can I get language embeddings for such datasets?
Some (e.g. the cable routing dataset) directly use a single constant string like "cable routing". What about the others?