tigerblue77 / Dell_iDRAC_fan_controller_Docker

Docker image to control your Dell PowerEdge fans via IPMI

Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory #27

Closed Daktyl198 closed 2 years ago

Daktyl198 commented 2 years ago

Trying to run this container gives me two errors, the one in the title, and one saying my fan speed is invalid (despite correctly interpreting it as 15%).

I assume the first error is leading to the second. I've tested using ipmitool on the host and it can connect to the local IPMI device just fine. Any help would be appreciated.

tigerblue77 commented 2 years ago

Hello, can you give me the docker run command or the docker compose file that you use?

Daktyl198 commented 2 years ago

I used the command on the README.md file here, but with my own values:

docker run -d \
  --name Dell_iDRAC_fan_controller \
  --restart=unless-stopped \
  -e IDRAC_HOST=local \
  -e FAN_SPEED=0x0f \
  -e CPU_TEMPERATURE_TRESHOLD=55 \
  -e CHECK_INTERVAL=60 \
  tigerblue77/dell_idrac_fan_controller:latest

I never tried the docker-compose, as it didn't seem like starting the container was the issue, only connecting to the local iDRAC devices.

I ended up writing my own script that calls ipmitool directly and running it from cron, and it works.

tigerblue77 commented 2 years ago

I think it's just a typo that crept into the README. On your Dell PowerEdge Docker host, check your IPMI interface name using:

ls /dev/ipmi*

Then adapt the IPMI interface name, if necessary, in the following docker run command (copied from yours) and try it:

docker run -d \
  --name Dell_iDRAC_fan_controller \
  --restart=unless-stopped \
  -e IDRAC_HOST=local \
  -e FAN_SPEED=0x0f \
  -e CPU_TEMPERATURE_TRESHOLD=55 \
  -e CHECK_INTERVAL=60 \
  --device=/dev/ipmi0:/dev/ipmi0:r \
  tigerblue77/dell_idrac_fan_controller:latest

Tell me if it works so that I can update the README file.
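For anyone adapting the command above, the device lookup can be scripted. This is a minimal sketch and not part of the image: `find_ipmi_device` is a hypothetical helper that prints the first existing path among the ones it is given. The candidate paths to pass in are the three named in the error message.

```shell
# Print the first existing path among the arguments; return 1 if none exists.
# Hypothetical helper for illustration, not part of the Docker image.
find_ipmi_device() {
  for candidate in "$@"; do
    # -e only checks that the path exists; on a real host the node
    # is a character device, which could be checked with -c instead.
    if [ -e "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

On the Docker host it might be called as `find_ipmi_device /dev/ipmi0 /dev/ipmi/0 /dev/ipmidev/0`, with the printed path then used on both sides of the `--device` option.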

Daktyl198 commented 2 years ago

Same error.

Idrac/IPMI host: local
Fan speed objective: 15%
CPU temperature treshold: 55°C
Check interval: 60s

Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
/Dell_iDRAC_fan_controller.sh: line 68: [: -gt: unary operator expected
/Dell_iDRAC_fan_controller.sh: line 69: [: -gt: unary operator expected
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
                   ------- Temperatures -------
   Date & time     Inlet  CPU 1  CPU 2  Exhaust          Active fan speed profile          Comment
/Dell_iDRAC_fan_controller.sh: line 113: printf: User static fan control profile (15%): invalid number
/Dell_iDRAC_fan_controller.sh: line 113: printf: CPU temperature decreased and is now OK (<= 55°C). User's fan control profile applied.: invalid number
20-08-22 19:45:51    0°C    0°C    0°C      0°C

My device is at /dev/ipmi0 and, like I said, I currently have a script calling ipmitool directly that works, so I'm not sure what the problem is.
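The `[: -gt: unary operator expected` lines in the log above are what bash prints when an unquoted operand of a `[ ... -gt ... ]` test expands to nothing, which is what would happen if the temperature variables stayed empty after ipmitool failed. Lines 68 and 69 of the script are not shown in this thread, so that cause is an assumption. A minimal sketch of a guarded comparison, with `is_above_threshold` as a hypothetical helper name:

```shell
# Return 0 only when the reading is a non-negative integer strictly above
# the threshold. Hypothetical helper, not taken from the project's script.
is_above_threshold() {
  reading="$1"
  threshold="$2"
  case "$reading" in
    # Empty or non-numeric reading: report "not above" instead of erroring.
    ''|*[!0-9]*) return 1 ;;
  esac
  # Both operands are quoted, so [ always sees three arguments.
  [ "$reading" -gt "$threshold" ]
}
```

Treating a missing reading as "not above the threshold" is only one possible choice. A real controller may prefer to fall back to the Dell default fan profile when it cannot read a temperature.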

tigerblue77 commented 2 years ago

I think the IPMI interface is not made available inside your container. I can suggest two tests:

and/or

Daktyl198 commented 2 years ago

Yeah, I can confirm that it's a problem with accessing the ipmi device from within the docker container. I'm not too worried about it, it's something to look into later. Thanks for the help.

tigerblue77 commented 2 years ago

Okay, glad that it helped. Don't hesitate to post here if you have any other information. Also, can you confirm that my first fix with --device=/dev/ipmi0:/dev/ipmi0:r is needed?

Daktyl198 commented 2 years ago

That fix didn't work. Running the script outside of the container worked. It's probably a problem with my setup more than with your container or script. I've not done any real configuration on this server beyond throwing Ubuntu on it for some game servers.

Daktyl198 commented 2 years ago

Btw, I decided to take another shot at this and use docker-compose (which I hadn't tried before) and it doesn't give me the device error anymore.

It does still give me the bad value fan percentage error though:

sudo docker logs d102ce820927
Idrac/IPMI host: local
Fan speed objective: 15%
CPU temperature treshold: 55°C
Check interval: 60s

                   ------- Temperatures -------
   Date & time     Inlet  CPU 1  CPU 2  Exhaust          Active fan speed profile          Comment
22-08-22 20:55:31   22°C   41°C   39°C      0°C  CPU temperature decreased and is now OK (<= 55°C). User's fan control profile applied.
/Dell_iDRAC_fan_controller.sh: line 113: printf: User static fan control profile (15%): invalid number

Seems to be a printing error only, as the actual fan speed works properly.

tigerblue77 commented 2 years ago

Which Dell PowerEdge server are you running? I suppose that you don't have any exhaust temperature sensor on it (that's why the value is "0°C"), which is probably the root cause of the printf error.

Can you add echo $EXHAUST_TEMPERATURE between lines 112 and 113 in Dell_iDRAC_fan_controller.sh and give me the output so that I can propose a fix for your case?

tigerblue77 commented 2 years ago

> Okay, glad that it helped. Don't hesitate to post here if you have any other information. Also, can you confirm that my first fix with --device=/dev/ipmi0:/dev/ipmi0:r is needed?

I had time to investigate this fix and found what was wrong: my script needs both read and write permissions on the IPMI device, so :r was not sufficient. I replaced it with :rw and fixed the README file. Thanks. The docker run command in the README file now works (don't hesitate to test and confirm).
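For reference, Docker's `--device` option takes `host_path:container_path:permissions`, where the permissions are any combination of `r` (read), `w` (write) and `m` (mknod). A small sketch of building the corrected flag follows. `ipmi_device_flag` is a hypothetical helper, and the default path is an assumption taken from this thread:

```shell
# Print a --device flag granting read and write access to the IPMI node.
# Hypothetical helper; the default path /dev/ipmi0 is an assumption.
ipmi_device_flag() {
  device="${1:-/dev/ipmi0}"
  # Same path on host and in container, with rw instead of r.
  printf '%s\n' "--device=${device}:${device}:rw"
}
```

The output can be substituted for the `--device=/dev/ipmi0:/dev/ipmi0:r` line in the docker run command shown earlier in the thread.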

I'll just wait for your feedback on the second error you encountered.

Daktyl198 commented 2 years ago

> Which Dell PowerEdge server are you running? I suppose that you don't have any exhaust temperature sensor on it (that's why the value is "0°C"), which is probably the root cause of the printf error.
>
> Can you add echo $EXHAUST_TEMPERATURE between lines 112 and 113 in Dell_iDRAC_fan_controller.sh and give me the output so that I can propose a fix for your case?

I am running a PowerEdge R420. No need to add echos to the script: I can confirm with ipmitool that my server doesn't report an exhaust temperature. Since the grep for Exhaust doesn't return anything, the $EXHAUST_TEMPERATURE variable is null or an empty string, however bash represents that.
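The "invalid number" message is consistent with an empty, unquoted variable disappearing from printf's argument list, so that the profile text shifts onto a `%d` conversion. Line 113 of the script is not shown in this thread, so this is an assumption. A minimal sketch of a row printer that tolerates a missing sensor follows. `numeric_or_dash` and `print_row` are hypothetical names, and the column layout is simplified:

```shell
# Print the reading unchanged when it is a non-negative integer, else "-".
numeric_or_dash() {
  case "$1" in
    ''|*[!0-9]*) printf '%s\n' '-' ;;
    *) printf '%s\n' "$1" ;;
  esac
}

# Print one table row: date, two temperature readings, profile text.
# Every argument is quoted and formatted with %s, so an empty reading
# can neither shift the other arguments nor reach a %d conversion.
print_row() {
  printf '%s  %4s  %4s  %s\n' \
    "$1" "$(numeric_or_dash "$2")" "$(numeric_or_dash "$3")" "$4"
}
```

For example, `print_row "22-08-22" 22 "" "profile"` prints a dash in the second temperature column rather than failing.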

Enohriel commented 2 years ago

Same issue: the only temperature my R710's IPMI can read is Ambient. Getting temps with lm-sensors would be great!

tigerblue77 commented 2 years ago

> > Which Dell PowerEdge server are you running? I suppose that you don't have any exhaust temperature sensor on it (that's why the value is "0°C"), which is probably the root cause of the printf error. Can you add echo $EXHAUST_TEMPERATURE between lines 112 and 113 in Dell_iDRAC_fan_controller.sh and give me the output so that I can propose a fix for your case?
>
> I am running a PowerEdge R420. No need to add echos to the script: I can confirm with ipmitool that my server doesn't report an exhaust temperature. Since the grep for Exhaust doesn't return anything, the $EXHAUST_TEMPERATURE variable is null or an empty string, however bash represents that.

Okay, so you are talking about another issue, which is already known. Please check this GitHub issue, which I have already worked on and which corresponds to your problem. I'm just waiting for some people to test my fix before merging it.

tigerblue77 commented 2 years ago

> Same issue: the only temperature my R710's IPMI can read is Ambient. Getting temps with lm-sensors would be great!

Same answer as in my previous message: please check this GitHub issue, which I have already worked on and which corresponds to your problem. I'm just waiting for some people to test my fix in PR https://github.com/tigerblue77/Dell_iDRAC_fan_controller_Docker/pull/40 before merging it.

As for using lm-sensors, there is also an open issue about it, but I haven't had time to work on it yet. Feel free to contribute! Thanks.