Hurricos / realtek-poe

5 stars 10 forks source link

realtek-poe: Add PSE ID quirk for Zyxel GS1900-24HP-v1 #6

Closed mrnuke closed 2 years ago

mrnuke commented 2 years ago

The "set global power budget" command contains a PSE controller in its argument list. This was hardcoded to 0. On some switches, commands to PSE 0 do nothing. In those cases, the set budget command won't stick. The power budget remains at its default value, usually 0. Because of this, ports remain stuck in the "Requesting power" state.

To solve this, send the budget command to the correct PSE ID or set of IDs. Use a mask to encode which PSE IDs neet the set budget command. When the "compatible" devicetree string matches Zyxel GS1900-24HP-v1, set the mask to 0xff, to command PS ID's 0->7. The vendor firmware also uses ID's 0->7. This enables power delivery to PoE sinks.

image

Signed-off-by: Alexandru Gagniuc mr.nuke.me@gmail.com

Hurricos commented 2 years ago

Tested CI#15 on my GS1900-24HPv1:

A WS-AP3825i plugged into port 1 powered on as soon as the device did.

Note that issuing reboot after this does shut off PoE devices. Might be a problem resulting from how we're hogging a GPIO line to bring the device up, not sure.

Tested-by: Martin Kennedy hurricos@gmail.com

mrnuke commented 2 years ago

The gpio-hogdefinition looks fine to me. I wonder if u-boot is forcing the GPIO low.

Hurricos commented 2 years ago

To add to my tested-by: I am noticing (though I have not A/B'd to verify this changeset / ddd72d2cd929723f053102d8bfb132e13946ea6e is responsible) that realtek-poe now sees certain ports as 'disabled':

                "lan1": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Searching"
                },
                "lan2": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Delivering power",
                        "consumption": 5.100000
                },
                "lan3": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Searching"
                },
                "lan4": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Searching"
                },
                "lan5": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Disabled"
                },
                "lan6": {
                        "priority": 2,
                        "mode": "PoE",
                        "status": "Searching"
                },

In fact, ports 9-24 all show as disabled (as does 5). I think this might be a race-condition of some sort -- I haven't traced it to see what it does, but ubus call poe reload actually fixes this and starts showing (through ubus call poe info) the ports being in the state I expect the config to have set them.

To add to this: I moved the PoE sink to port 23, found it come up; issued /etc/init.d/poe restart, found it stayed down and reported as Disabled; then issued an /etc/init.d/poe reload and found it working now.

I haven't tried applying just the functional / non-refactoring changes to test whether this is a regression from one of the commits in #4, and I bet the new functional commits would not apply without your cleanup. Just giving you a heads up that /etc/init.d/poe reload is needed after /etc/init.d/poe restart in order for all configured ports to come up.

I can tweak /etc/init.d/poe to use realtek-poe -d, to pass you a dump of the frames being sent here, if you'd like.

mrnuke commented 2 years ago

No need to tweak /etc/init.d/poe, but do make sure you're not running two instances of realtek-poe at the same time:

# /etc/init.d/poe stop
# ps | grep realtek-poe
7939 root      1308 S    grep realtek-poe
# realtek-poe -d
Hurricos commented 2 years ago

No need to tweak /etc/init.d/poe, but do make sure you're not running two instances of realtek-poe at the same time:

# /etc/init.d/poe stop
# ps | grep realtek-poe
7939 root      1308 S    grep realtek-poe
# realtek-poe -d

I thought I needed to tweak it, as /etc/init.d/poe reload expects (?) the daemon to be initialized with procd.

ubus call poe reload at one point acted differently from /etc/init.d/poe reload, not 'unsticking' the output of ubus call poe info, but I can't reproduce that so I suspect it's an unrelated bug.

mrnuke commented 2 years ago

I'm not surprised. realtek-poe is propped up by 2x4s and held together with duct tape.

First milestone is to get it to power on the ports. That makes 99% of users happy. Then we can worry about the undefined behavior and deadlocks that I found.

walterav1984 commented 2 years ago

While still using CI14 realtek-poe package I can confirm that just a ubus call poe reload changes the port status for ports 9-24 from Disabled to Searching and after running ubus call poe sendframe '{"frame" : "18 00 07 07 08 00 00"}' its shows status Delivering power and starts powering PoE devices on port 9-24 :smiley:!

ubus call poe info
        "lan9": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Delivering power",
            "consumption": 2.400000
        },
...
        "lan24": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Delivering power",
            "consumption": 4.600000
        }

Reconnecting PoE devices on other 9-24 ports will power those devices on although the ubus call poe info will not always show the correct status after that.

Doing a /etc/init.d/poe restart will just flat out put all status back to Disabled on ports 9-24 and will disable PoE and poweroff all devices on these ports. Contrary this service restarts works just fine for ports 1-8. Reapplying a ubus call poe reload will bring back PoE devices status and power on ports 9-24.

EDIT: Also doing a reboot while ports 9-24 were active before with PoE, just quickly powers off and powers on the PoE devices during boot but eventually PoE power goes and stays down for port 9-24. However port status now shows a mix of Disabled and unknown after reboot.

{
    "firmware": "v17.1",
    "mcu": "ST Micro ST32F100 Microcontroller",
    "budget": 170.000000,
    "consumption": 0.000000,
    "ports": {
        "lan1": {
            "priority": 2,
            "status": "Searching"
        },
        "lan2": {
            "priority": 2,
            "status": "Searching"
        },
        "lan3": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Searching"
        },
        "lan4": {
            "priority": 2,
            "status": "Searching"
        },
        "lan5": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Searching"
        },
        "lan6": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Searching"
        },
        "lan7": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Disabled"   <--- Also notice this strange Disabled behaviour while not disabled in config
        },
        "lan8": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Disabled"   <--- Also notice this strange Disabled behaviour while not disabled in config
        },
        "lan9": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Disabled"   <--- Poe device connected
        },
        "lan10": {
            "priority": 2,
            "mode": "PoE+",
            "status": "Disabled"
        },
        "lan11": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan12": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan13": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan14": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan15": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan16": {
            "priority": 2,
            "status": "Disabled"
        },
        "lan17": {
            "priority": 2,
            "status": "unknown"
        },
        "lan18": {
            "priority": 2,
            "status": "unknown"
        },
        "lan19": {
            "priority": 2,
            "status": "unknown"
        },
        "lan20": {
            "priority": 2,
            "status": "unknown"
        },
        "lan21": {
            "priority": 2,
            "status": "unknown"
        },
        "lan22": {
            "priority": 2,
            "status": "unknown"
        },
        "lan23": {
            "priority": 2,
            "status": "unknown"
        },
        "lan24": {
            "priority": 2,
            "status": "unknown"   <--- PoE device connected
        }
    }
}

ubus call poe reload #will poweron again devices 9-24 after reboot, this is not necesarry for ports 1-8
#status is still incorrect although PoE works again
#to get poe and status info correct it seems you always need a combination of `/etc/init.d/poe restart && ubus  call poe reload`

I can also confirm @Hurricos that there is strange behaviour concerning ports 5,6,7, they sometimes ran into a Disabled state after reconnecting PoE device for to many times. Although the /etc/config/poe config file lists the ports enabled it will refuse to power on PoE devices, even after rebooting the Switch nor after /etc/init.d/poe restart the PoE devices won't come on these ports and status stays disabled! However, doing a "real" disabled in /etc/config/poe config file for instance port 5 and restarting the service and than doing a "real" enable again for that same port in config file and doing a service restart makes the port status and PoE work again...

walterav1984 commented 2 years ago

Maybe an interresting but logical find is that disabling /etc/init.d/poe start at boot keeps ports 9-24 powered up with PoE "after reboot" albeit with the short interuption mentioned earlier.

mrnuke commented 2 years ago

Would be useful to have realtek-poe -d logs when the issue occurs.

Hurricos commented 2 years ago

Would be useful to have realtek-poe -d logs when the issue occurs.

Here's a .tar.gz of it: https://paste.c-net.org/LeelooLucid

1) realtek-poe -d > /tmp/realtek-poe-d.log 2>&1 was run standalone (poe service disabled) 2) ubus call poe info > /tmp/ubus-call-poe-info-1 was run 3) ubus call poe reload was run 4) ubus call poe info > /tmp/ubus-call-poe-info-2 ... 5) ubus call poe reload ... 6) ubus call poe info > /tmp/ubus-call-poe-info-3 ...

Each of these was run with about 15s between then.

Hurricos commented 2 years ago

.tar.gz of it

One more thing -- that run didn't show the Disabled values I expected. This one did (raw log file): https://paste.c-net.org/AlliancePackard

mrnuke commented 2 years ago

Seems to be the same root cause as https://github.com/Hurricos/realtek-poe/issues/9#issuecomment-1169107327

Hurricos commented 2 years ago

Seems to be the same root cause as #9 (comment)

OK. Looks like it's a separate issue unrelated to this PR. Some next steps are outlined here.

I am now merging this. Thank you @mrnuke!