[BUG] Docker env errors with 10.3.2 + Image Size concerns

cooperlees commented 1 year ago

Describe the bug

The docker image has grown quite a lot. Was this expected?

kaveenk/gpt3discord              latest     b4c0677089fe   15 hours ago    4.07GB

I also updated to latest_release and latest and get the following error (so I am guessing I'm on 10.3.2):

Loading environment from .env
Loading environment from /opt/gpt3discord/etc/environment
Loading environment from None
Attempting to retrieve the settings DB
Retrieved the settings DB
Traceback (most recent call last):
  File "/opt/gpt3discord/bin/gpt3discord.py", line 14, in <module>
    from cogs.search_service_cog import SearchService
  File "/usr/local/lib/python3.9/site-packages/cogs/search_service_cog.py", line 13, in <module>
    ALLOWED_GUILDS = EnvService.get_allowed_guilds()
  File "/usr/local/lib/python3.9/site-packages/services/environment_service.py", line 87, in get_allowed_guilds
    allowed_guilds = [int(guild) for guild in allowed_guilds]
  File "/usr/local/lib/python3.9/site-packages/services/environment_service.py", line 87, in <listcomp>
    allowed_guilds = [int(guild) for guild in allowed_guilds]
ValueError: invalid literal for int() with base 10: ''

I'm guessing there might be config changes since I last did an update - My config:

cooper@us:/containers$ cat /containers/gpt3discord/env 
DATA_DIR="/data"

OPENAI_TOKEN="FOO"

DISCORD_TOKEN="FOO"

ALLOWED_GUILDS="811050810460078100"
ALLOWED_ROLES="Admin,gpt"
DEBUG_GUILD="811050810460078100"
DEBUG_CHANNEL="1058174617287663689"
# This is the channel that auto-moderation alerts will be sent to
MODERATIONS_ALERT_CHANNEL="1058174617287663689"

# People with the roles in ADMIN_ROLES can use admin commands like /clear-local, and etc
ADMIN_ROLES="Server Admin,Owner,Special People"
# People with the roles in DALLE_ROLES can use commands like /dalle draw or /dalle imgoptimize
DALLE_ROLES="Server Admin,Special People,@everyone"
# People with the roles in GPT_ROLES can use commands like /gpt ask or /gpt converse
GPT_ROLES="Special People,@everyone"
WELCOME_MESSAGE="Long ass message removing for paste"

Have I missed something there? I noticed that after the update a lot more env vars are set to nothing in the docker container itself. Might that be the bug here, resetting my ALLOWED_GUILDS? Has anyone else had an issue upgrading?

(Sadly I don't know the version I was on, but I might roll back to a 9.x.x and see if it works for now)

Thanks in advance

cooperlees commented 1 year ago

Rolling back to 9.1 worked. Will hard pin there for now.

For reference 9.1 is only 322MB:

kaveenk/gpt3discord              v9.1       9f36e3abca57   2 weeks ago     322MB

cherryroots commented 1 year ago

The size is due to torch being installed which is 2gb itself and also cuda being installed for both which is a further 1.5gb. A total 3.5gb increase

cooperlees commented 1 year ago

> The size is due to torch being installed which is 2gb itself and also cuda being installed for both which is a further 1.5gb. A total 3.5gb increase

That's quite a large increase to throw at people. Is there a summary where I can see what including these large libraries gives me? It's hard to parse https://github.com/Kav-K/GPT3Discord/compare/v9.1...V10.1 and other releases as a part-time visitor here ...

Especially since I don't run this on a GPU enabled box this feels wasted space to me. Can we maybe look at a fat GPU docker container and a non GPU version?

cherryroots commented 1 year ago

> Especially since I don't run this on a GPU enabled box this feels wasted space to me. Can we maybe look at a fat GPU docker container and a non GPU version?

The intention was to only install the cpu versions, so I'm not too sure why there are a bunch of cuda files that have been added too alongside the cpu models. The whole dockerfile is kind of a mess atm in general

The other error you're getting is indeed because of the empty ENV's, it was added recently for some docker compose setup, but I don't think it was tested to actually run a container with that change. They can safely be removed if you're just building normally. Not sure about with compose since I didn't add it
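The failure mode described here can be sketched in a few lines. This is only an illustration, not the project's actual code: the traceback shows just the list comprehension on line 87, so the comma split and the helper name `parse_guild_ids` are assumptions.

```python
def parse_guild_ids(raw):
    """Parse a comma-separated string of guild IDs, skipping empty entries.

    Hypothetical helper; assumes ALLOWED_GUILDS is split on commas.
    """
    return [int(part) for part in raw.split(",") if part.strip()]


# Splitting an empty string still yields one (empty) element,
# and int("") is what raises the ValueError seen in the traceback.
assert "".split(",") == [""]

try:
    [int(guild) for guild in "".split(",")]
except ValueError as exc:
    print(f"unfiltered parse fails: {exc}")

# Filtering empty entries lets an unset value fall through as an empty list.
print(parse_guild_ids(""))
print(parse_guild_ids("811050810460078100"))
```

Whether an empty list should mean "no guilds allowed" or should be treated as a configuration error is a design choice the sketch leaves open.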

cherryroots commented 1 year ago

As for what torch is required for, it's used for indexing, mostly audio indexing with whisper. Kaveen will have to pitch in with what sentence-transformers is required for in the indexing. If you absolutely wish to save on space you can remove sentence-transformers from the requirements and also remove lines 28-31 in the dockerfile. It'll only raise an ImportError when trying to index a video or audio file afaik.

cooperlees commented 1 year ago

We could maybe make them extra installs in the pyproject.toml and publish separate containers with the extra deps so people can choose via tags.

We could also make the code friendlier and catch the ImportError and just share a friendly "X won't work cause dep Y isn't installed" error maybe too.
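A rough sketch of that idea, assuming the heavy dependency is imported lazily at the point of use. The helper names `optional_import` and `index_audio` are made up for illustration and are not part of the project.

```python
import importlib


def optional_import(module_name, feature):
    """Return the imported module, or None (with a friendly message) if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        print(f"{feature} won't work because '{module_name}' isn't installed.")
        return None


def index_audio(path, module_name="whisper"):
    """Hypothetical entry point: only touches the heavy dep when actually called."""
    backend = optional_import(module_name, "Audio indexing")
    if backend is None:
        return False
    # Real indexing with the backend would happen here; omitted in this sketch.
    return True
```

With this shape, a slim image without the extra deps would still start, and only the indexing command would report the missing package.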

An example of shipping different docker containers is pypi.org's mirroring software bandersnatch. It releases a filesystem-only image (65MB), an S3 version (76MB) and a Swift version (106MB) due to the dependency bloat.

Tags can be seen here: https://hub.docker.com/r/pypa/bandersnatch/tags
Docker File: https://github.com/pypa/bandersnatch/blob/main/Dockerfile

But anyway, size aside, I need to work out what changed in 10.X for docker, since it didn't start for me while 9.1 worked just fine. I use ansible to start my container as follows:

---
  - name: Make Container dir
    file:
      path: "{{ item }}"
      state: directory
      recurse: yes
      mode: '0775'
    loop:
      - /containers/gpt3discord/data

  - name: Copy env config
    copy:
      src: env
      dest: /containers/gpt3discord/env
      owner: cooper
      group: users
      mode: '0644'
    register: update_config

  - name: Start gpt3discord container
    docker_container:
      name: gpt3discord
      hostname: "{{ inventory_hostname_short }}-gpt3discord"
      image: kaveenk/gpt3discord:v9.1
      pull: "{{ force_docker_pull | bool }}"
      state: started
      network_mode: routable_net
      networks:
      - name: routable_net
        ipv4_address: "{{ gpt3discord_ip }}"
        ipv6_address: "{{ gpt3discord_ip6 }}"
      networks_cli_compatible: yes
      volumes:
        - /containers/gpt3discord/data:/data:rw
        - /containers/gpt3discord/env:/opt/gpt3discord/etc/environment:ro
      restart_policy: unless-stopped
      # Resource Limits
      cpu_shares: 512  # default 1024
      memory: 512m

There should be a way to support both needs here.

cherryroots commented 1 year ago

> But anyway, size aside, I need to work out what changed in 10.X for docker, since it didn't start for me while 9.1 worked just fine. I use ansible to start my container as follows:

The breaking change is really only the extra ENV's from line 47 to line 59 in the dockerfile. I get the same error with them in place, and it goes away after I comment them out. As I mentioned, those lines were apparently added for compose in #151 and likely only tested with compose
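One plausible reading of why the empty ENV's break a mounted env file, sketched under a loud assumption: the env file is loaded without overriding variables that already exist in the process environment. The thread does not show the loader, so `load_env_lines` below is a hypothetical stand-in, not the project's code.

```python
def load_env_lines(lines, environ, override=False):
    """Apply KEY="value" lines to environ (hypothetical loader).

    With override=False, keys already present in environ are left alone,
    even when their existing value is an empty string.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        if override or key not in environ:
            environ[key] = value
    return environ


env_file = ['ALLOWED_GUILDS="811050810460078100"']

# Image built with an empty ENV entry: the value from the file never lands.
print(load_env_lines(env_file, {"ALLOWED_GUILDS": ""}))
# Image without the ENV entry: the value from the file is used.
print(load_env_lines(env_file, {}))
```

Under that assumption, either removing the empty ENV lines or loading the file with override semantics would let the mounted configuration take effect.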

connorv001 commented 1 year ago

Yes, I was the author. Let me fix these for you.

connorv001 commented 1 year ago

@cooperlees @Hikari-Haru

Kav-K commented 1 year ago

@connorv001 Thank you!

cooperlees commented 1 year ago

Hi Connor - Thanks for looking into this. What are you planning to do? Can we just delete the lines in the Dockerfile, or does that break your compose use case? (I've never meaningfully used docker-compose, so I'm unsure what repercussions deleting the env initializations in the Dockerfile would have for you.) If we can just delete them, I'm happy to put the PR up. I was only asking to see if I could help. I just don't know how to test your specific docker-compose use case. I guess it's just using it like the documentation you added.

connorv001 commented 1 year ago

> Hi Connor - Thanks for looking into this. What are you planning to do? Can we just delete the lines in the Dockerfile, or does that break your compose use case? (I've never meaningfully used docker-compose, so I'm unsure what repercussions deleting the env initializations in the Dockerfile would have for you.) If we can just delete them, I'm happy to put the PR up. I was only asking to see if I could help. I just don't know how to test your specific docker-compose use case. I guess it's just using it like the documentation you added.

I'm planning to make multiple docker versions with different tags for different use cases, so that nothing in legacy gets broken. If you want, we can also work together; feel free to add me on discord.

connorv001 commented 1 year ago

Well, I tried. Even using alpine, look at this:

cooperlees commented 1 year ago

The base image in use (python:3.9-slim) is fairly small already. We build and copy huge fat python dependencies here, so changing base image isn't going to be helpful here.

cherryroots commented 1 year ago

Had a go at this and I think I've gotten it a lot better, have not had the time to test out the images yet, enable arm64 and just clean up a bit in general. Will try and see if I can reduce it more for the full image

Kav-K commented 1 year ago

Fixed by #169

Kav-K / GPTDiscord

[BUG] Docker env errors with 10.3.2 + Image Size concerns #155