sotopia-lab / sotopia

Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
https://docs.sotopia.world
MIT License

Update dependency list and move optional ones to extra dependencies #80

Closed ProKil closed 1 month ago

ProKil commented 1 month ago

Close #59

This is the start of an ongoing effort to make sotopia's dependency list slimmer.

Similar to tinygrad's point, having fewer dependencies will make it easier for more people to use our library alongside the other tools they are using (no need to resolve conflicts that don't actually exist).

I think the only 8 dependencies that we should have are:

  1. pydantic: for data validation and abstraction,
  2. redis-om: for data serialization, logging, and communication,
  3. tqdm: for fun,
  4. aiohttp: for calling REST APIs,
  5. beartype: for type guarding,
  6. lxml: for handling different visibilities,
  7. jinja2: for prompt templates,
  8. and gin: for configuration, which we should eventually remove as well.

Options:

[example]:

[litellm]:

[chat]:

[testing]:

[dev]:

📑 Description

In this PR, I made progress towards this goal --

This is the current required dependency list:

pandas = "^2.1.1"
lxml = "^4.9.3"
openai = "^1.11.0"
langchain = "0.1.5"
rich = "^13.6.0"
PettingZoo = "1.24.0"
redis-om = "^0.2.1"
pandas-stubs = ""
types-tqdm = ""
gin-config = "^0.5.0"
absl-py = "^2.0.0"
together = "^0.2.4"
pydantic = "1.10.12"
beartype = "^0.14.0"
langchain-openai = "^0.0.5"
litellm = "^1.23.12"

We now have 14 required dependencies (not counting the type stubs). Removing the other 6 takes more work, for the following reasons:

  1. pandas is only used for converting a dict into a csv. This should be a feature that we can easily implement.
  2. openai and together are our "default" LLMs, which we could do without, but that requires more changes.
  3. rich is used for printing, which we could make conditional.
  4. PettingZoo is our initial abstraction for environments, which we have in fact deviated from. We could do without it in a small PR.
  5. litellm could be optional in a following PR
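
For item 1, a minimal sketch of what a pandas-free dict-to-CSV helper could look like, using only the standard library. The function name `rows_to_csv` and the assumption that the data is a list of flat dicts are mine, not taken from the sotopia codebase; the real call site may need a different shape.

```python
import csv
import io
from typing import Mapping, Sequence


def rows_to_csv(rows: Sequence[Mapping[str, object]]) -> str:
    """Serialize a list of flat dicts to CSV text (hypothetical helper)."""
    if not rows:
        return ""
    # Collect column names in first-seen order so rows with extra keys still fit.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in columns that a given row does not have.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

If something like this covers the existing use, pandas (and pandas-stubs) could probably move out of the required list.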

Here are our optional dependencies:

fastapi = { version = "^0.109.2", optional = true }
tabulate = { version = "^0.9.0", optional = true }
anthropic = { version = "^0.26.0", optional = true }
xmltodict = { version = "^0.13.0", optional = true }
groq = { version = "^0.4.2", optional = true }
cohere = { version = "^5.1.8", optional = true }
google-generativeai = { version = "^0.5.4", optional = true }
transformers = { version = "^4.41.0", optional = true }
datasets = { version = "^2.19.0", optional = true }
scipy = { version = "^1.13.1", optional = true }
torch = { version = "^2.3.0", optional = true }

✅ Checks

ℹ Additional Information

ProKil commented 1 month ago

@XuhuiZhou

I added datasets back to the examples extra. You will be able to install it with either pip install sotopia[examples] or poetry install -E examples.

Now we have two different kinds of dependencies:

  1. Required dependencies are for the included files -- ./sotopia.
  2. Optional dependencies (extras) for either:
    1. external LLM support
    2. or scripts outside of ./sotopia, incl. examples and sotopia-chat

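
One common way to keep optional dependencies out of the import path is to load them lazily and fail with an install hint. The sketch below illustrates that pattern only; `require_extra` is a hypothetical helper, not something that exists in sotopia, and the extra name passed in is whatever the caller declares.

```python
import importlib
from types import ModuleType


def require_extra(
    module_name: str, extra: str, package: str = "sotopia"
) -> ModuleType:
    """Import an optional dependency, or raise with an install hint.

    Hypothetical helper: `extra` is the name of the extra that provides
    `module_name`.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is an optional dependency. "
            f"Install it with: pip install {package}[{extra}]"
        ) from exc
```

A script under examples could then call `require_extra("datasets", "examples")` at the point of use, so that importing the core package never touches the optional modules.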
ProKil commented 1 month ago

lmlib is removed already? I will reply after I add the docs

ProKil commented 1 month ago

@XuhuiZhou the doc is added

ProKil commented 1 month ago

@XuhuiZhou @lwaekfjlk Please check the current code and approve if it looks good.

As for the efficiency of CI, I think it currently looks OK: we spend around 7 minutes per push, so a 2000-minute monthly CI quota allows around 300 pushes per month, or 10 pushes per day. We can make this more efficient in a separate PR, following the various solutions in #79.
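
The estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes the 2000-minute monthly quota mentioned later in this thread and a 30-day month; the function name is made up for illustration.

```python
def ci_push_budget(
    monthly_minutes: int, minutes_per_push: int, days_per_month: int = 30
) -> tuple[int, float]:
    """Return (pushes per month, pushes per day) that fit in the CI quota."""
    pushes_per_month = monthly_minutes // minutes_per_push
    pushes_per_day = pushes_per_month / days_per_month
    return pushes_per_month, pushes_per_day


# Assumed inputs: 2000 CI minutes per month, about 7 minutes per push.
per_month, per_day = ci_push_budget(2000, 7)
print(per_month, per_day)  # 285 9.5
```

So "around 300 per month, or 10 per day" is a slight round-up of 285 and 9.5.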

lwaekfjlk commented 1 month ago

Yes, sure. Sorry, I just checked again: the 2000-minute CI limit applies to private repos only; public ones are not included.