ManifoldRG / Manifold-KB

This repository is a knowledge base of key insights and details from other research and implementations. It serves as a reference and as a single place to document the various possible paths to achieving a goal.
GNU General Public License v3.0

AF Survey - Non Language Centric Agents #14

Open harshsikka opened 1 year ago

harshsikka commented 1 year ago

Most of the modern agent-based literature has focused on using LLMs as the core controller in an "agent". See ManifoldRG/Manifold-KB#18

This makes sense given the general applicability of language and the representational capacity of LLMs. But what other kinds of agents are there?

We're looking to collect and understand examples of agents that are based on vision input, or that are multimodal (language paired with other modalities).

Output: Candidate papers for these kinds of agents

helenlu66 commented 1 year ago

dumping some things I've come across:

  1. This paper gives a point of view from the field of embodied AI. Embodied AI agents learn through interactions with their environment (often with simulators of real-world physical environments). The paper focuses on visual exploration, visual navigation, and embodied QA tasks. A_Survey_of_Embodied_AI_From_Simulators_to_Research_Tasks.pdf
  2. This is the Mind2Web agent we talked about: Towards A Generalist Web Agent
  3. Action Transformer: ACT-1 Adept Action Transformer
  4. Cognitive architectures for modeling human cognition and building AGIs: ACT-R, SOAR, survey of cognitive architectures in past 20 years, The Soar of Cognitive Architectures

bhavul commented 1 year ago

Thanks, folks. Now that we're expanding the Survey effort to the community, it would be useful to edit the above posts to follow the "Issue format" specified in the community guidelines whenever you have a few minutes: https://docs.google.com/document/d/1LPCl8ivbPQsEx96sGBPeCY7AM8PWh7RUcy-TNNJkQ50/edit#heading=h.v41a31azfs4m