docker.sock security risk exposure

codefaux commented 7 months ago

Hey there. Fantastic work and I'd love to test this out. I just started setting it up and ran into a roadblock.

It is strongly advised not to mount docker.sock inside a cointainer.

It's a serious security risk -- regardless of your personal code hygiene and security practices with this repository, you're relying on the security of every link in the expansive Docker distribution network to remain intact and perfectly secure. From the Docker engine, containers can gain information about mounted volumes and filesystems of every container on the daemon. New containers can be created, Docker Plugins can be installed which most users don't even know about, let alone know how to check if they're infected by one. The recent xz-utils backdoor which was pulled into a few actual Linux distributions is a great example of why security should always be tended.

When's the last time you ran docker plugin ls to see what plugins were active? How long could you have been infected a tainted container?

Making hotio's lidarr container execute a script at boot is actually designed into the Linuxserver containers -- mount your script file into /etc/cont-init.d/ on Lidarr and it will run that script on container startup.

An example 99-update-cacerts.sh;

#!/command/with-contenv bash
echo "Updating ca certificates for lidarr-deemix API mitmproxy hijack..."
/usr/sbin/update-ca-certificates

Example docker-compose.yml; ( Just wrote this by hand, didn't test it -- I'm using Unraid.)

version: "3"
services:
  lidarr:
    # hotio is used explicitly because it provides container init extensions. pr-plugins branch is required for Deemix plugin.
    # Once installed, use Lidarr>System>Plugins page to install plugin via Github URL box:
    #  https://github.com/ta264/Lidarr.Plugin.Deemix
    image: ghcr.io/hotio/lidarr:pr-plugins
    container_name: lidarr
    depends_on:
      - lidarr-deemix
    environment:
      # PUID/PGID should be same on all containers who act on eachothers' files
      - PUID=1000
      - PGID=1000
    volumes:
      # your existing volumes for /config and music sources
      - ./99-update-cacerts.sh:/etc/cont-init.d/99-update-cacerts.sh
      - ./lidarr-deemix-certs:/usr/local/share/ca-certificates # IMPORTANT - to update the ca-certificates

  lidarr-deemix:
    image: ghcr.io/ad-on-is/lidarr-deemix
    container_name: lidarr-deemix
    volumes:
      - ./lidarr-deemix-certs:/root/.mitmproxy
    environment:
      - LIDARR_CONTAINER=lidarr
      - DEEMIX_ARL=xxxx

  deemix:
    # OPTIONAL!
    # to use deemix as indexer/downloader, this works well in combination with ghcr.io/hotio/lidarr:pr-plugins
    image: codefaux/deemix-for-lidarr
    container_name: deemix
    restart: always
    environment:
      - PUID=1000
      - PGID=1000
      - UMASK=0022
      - DEEMIX_SINGLE_USER=true
    volumes:
      - ./deemix:/deemix-gui/config
      - ./downloads:/downloads

Excited to set the rest up and see how it goes.

Thanks for the work!

codefaux commented 7 months ago

Actually, I'll edit that suggestion -- making the container depend on Lidarr means Lidarr comes up first. That's problematic -- that means the proxy is offline when Lidarr starts, which means the container init won't help much. Lidarr needs to depend on the api proxy container so the api proxy container comes up first.

ad-on-is commented 7 months ago

Hey, thx for the hint, and you're right about the security issue.

I already knew this was a bad idea, but upon testing I found the http requests wouldn't sometimes work, and an update-ca-certs would fix the issue. I haven't digged deeper into it, but I assume the certs have a short expiration date, or whatever.

Additionally, I didn't want to "inject" too much into existing setups, and this was the "only" solution I came up with.

I'll ditch the docker.sock in the next version, and use your proposed init-script. Can I script it the same way it is now, like, with a loop and running in the background?

codefaux commented 7 months ago

Hey, thx for the hint

No worries, glad to help any way I can.

... upon testing I found the http requests wouldn't sometimes work, and an update-ca-certs would fix the issue.

Since the mitm proxy has to handle ssl requests, it has to decrypt, modify, and re-sign said requests. It can't sign with the original certs, so it has to use its own certs. mitmproxy has a "certificate authority" certificate built into it (the certificate which tells a client they can trust any certificate for any website provided by this source) which is provided to the client when the client runs its update-ca-certs command. I haven't been using it long enough to see the certs expire, though -- I only recently got things running.

... this was the "only" solution I came up with

I've been hacking things into Docker containers for a while -- both hotio and linuxserver containers provide extensions like this for better integration for folks like us. I wouldn't call it common knowledge, lol. No blame in not having it.

... use your proposed init-script.

I'd welcome it.

Can I script it the same way it is now, like, with a loop and running in the background?

I'm not sure what you mean by that. The script I wrote isn't a loop, and it only runs once during container init as hotio's scripts parse the folder. Are you saying you currently are looping the update-ca-certs command in your code? If the init script is written as a loop, it will never exit and not allow the container to finish starting up. A timed loop would require an additional file. Given the documentation, I was under the impression the command only had to be run once, at container init;

https://github.com/ad-on-is/lidarr-deemix/blob/eb080ca171b74ceae5d3f4cecd7e0e4b7187779e/README.md?plain=1#L44

ad-on-is commented 7 months ago

Yeah, the docs are a bit misleading.. have a look at the run.sh script which is executed as the entrypoint.

codefaux commented 7 months ago

Hmm, yes, I see. That's unfortunate. That would require something a bit more complex. For now, I'll just turn off certificate verification for Lidarr if I run into issues without the background loop. Worst case if every site Lidarr uses is source-compromised, my music metadata gets screwy and my Deezer ARL gets leaked.

Lidarr>Settings>General>Security>Certificate Validation to Disabled

Worth noting but unrelated; the Deezer API metadata profiles don't seem to respect Lidarr's Metadata Profiles. Setting a profile exclusively to Official Studio Albums, every album type shows under the Artist page. I don't know enough about the API requests/formatting to say where the problem comes from.

ad-on-is commented 7 months ago

Lidarr>Settings>General>Security>Certificate Validation to Disabled

Ooh, didn't know about that option. I think this might be even better, just to disable it.

Regarding metadata: I was hoping Lidarr would grab all the infos and deal with the selection internally later on, but looks like it doesn't.

Afaik, it works as follows:

Lidarr calls artist URL, response contains albums, eps, singles
- at this step, lidarr-demix doesn't know the "types" and just tags them as albums, which might be the culprit. maybe I've overseen something here. might have a second look
Lidarr calls each album URL individually
- here lidarr-demix knows the type

codefaux commented 7 months ago

at this step, lidarr-demix doesn't know the "types" and just tags them as albums, which might be the culprit. maybe I've overseen something here. might have a second look

I think this is worth a deeper examination.

Lidarr is able to determine that they are EP, Single, Album and sort them into categories in the UI, and its only data source for those albums is the Deezer API patch you've provided. In my mind, the difference can only happen one of three ways;

Lidarr does the "Refresh Artist" lookup and its situation executes as you described. This would mean Lidarr would then be required to run several filtered API calls to Deezer to narrow the list -- its only way to know anything about those albums is Deezer, after all. I feel it likely you would have noticed these additional passes, and if Lidarr could not get the information, it would not be able to sort them into categories in the UI, so this is unlikely to me.
Lidarr does the "Refresh Artist" lookup, and the API patch being applied is not taking into account an additional filter parameter being sent from Lidarr during the query, ie Lidarr says "I need this artist, but only Albums" and the API patch is missing the "but only Albums" part -- wether filtered by Deezer API features or the API patch itself.
Lidarr does the "Refresh Artist" lookup and its situation executes as you described. The remote metadata API reply would then have fields not being duplicated by the API patch, such as a second place to tag a release as EP/Album/Single (maybe by integer as well as string, for example)

I guess I could research the mitmproxy you're using in the mean time. Looks fun.

codefaux commented 7 months ago

Also, something just occurred to me.

You're writing an API proxy/filter to handle metadata between Lidarr and Deezer.

Deemix-gui serves, essentially, as an overblown API proxy/filter to handle Indexer searches and the Download Client role for Deezer.

Maybe we should collaborate in a better realtime environment? Telegram would be ideal, but it seems like not many folks have it...

codefaux commented 7 months ago

Hmm. Last one for the night. I just discovered that there's no way to "clean" metadata, and as such, some of our previous discussion (mostly mine) has been incorrect.

Setting your API container to override Musicbrainz API and doing a Refresh Artist will not remove Musicbrainz-provided Albums which already existed, even if no music files exist for it.

The only way to get a 100% clean Deezer-focused client is to install, from scratch, with Deezer override turned on.

......guess I'm gonna go do that. lol. I'll let you know tomorrow how it goes, but you might not need to address the issue I mentioned here

ad-on-is commented 7 months ago

You're right, with the realtime environment. I just added a TROUBLESHOOTING guidline, and added a link to the Telegram channel that I also just created.

Feel free to join!

ad-on-is commented 7 months ago

at this step, lidarr-demix doesn't know the "types" and just tags them as albums, which might be the culprit. maybe I've overseen something here. might have a second look

I was right and wrong at the same time... This step is indeed the one where Lidarr filters the Metadata profile, and record_type is in the json of deemix, I just didn't see it. I've pushed a new version and this should work now.

codefaux commented 7 months ago

I was right and wrong at the same time... This step is indeed the one where Lidarr filters the Metadata profile, and record_type is in the json of deemix, I just didn't see it. I've pushed a new version and this should work now.

Ah, fantastic! I'm glad to hear you found it.

Just pulled and tested -- looks promising so far, that does fix the issue of unrequested types showing for Artists for sure. (I wonder how many times I have to metadata scrape several hundred artists before they ban my account...lol...)

Before the update, I DID still have metadata "poisoning" from persistent Musicbrainz albums. The issue DID still persist and after the update it is fixed now. I can remove albums/etc and they disappear from the list like they should. Woo.

I did find a few instances of odd behavior regarding album matching, but I'll catch you on Telegram for that.

I've joined the channel with my dev Telegram account but am unable to post to it, I'll be around. Thanks for the great work so far!

(You're still using Docker stuff in the container so I'll leave this open for you to close at your discretion. Don't forget to remove the packages to slim the container, as well as the commands. If you do continue to use the Docker socket binding, I suggest at least testing if the socket exists before running the commands -- that way it doesn't just ping an error every few hours for no reason for users who don't bind it.)

ad-on-is commented 7 months ago

I will go with the option to not check for SSL probably.. since it's the easiest one, and doesn't require any certificate reloads, etc.

I myself am new to telegram.. amybe I did something wrong :-D

ad-on-is / lidarr-deemix

docker.sock security risk exposure #4