Anime4000 / RTL960x

Hacking RTL960x based xPON ONU Stick to suite your Universal OLT
The Unlicense

DFP-34X-2C2 not reporting some stats (temperature, Tx Bias Current, etc) to Mikrotik RB4011iGS+ #99

Closed rndm2 closed 5 months ago

rndm2 commented 1 year ago

Any chances to get it working?

Screenshot 2022-11-26 at 23 59 28

Mikrotik RB4011iGS+ DFP-34X-2C2 with M110_sfp_ODI_220923 firmware

rajkosto commented 1 year ago

Duplicate of https://github.com/Anime4000/RTL960x/issues/87. This stick only has a static EEPROM linked to the SFP interface, so no, the values cannot change in real time. Use the management web GUI to read the RX/TX levels. RX LOSS is reported via SFP, though.
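For context, here is a rough sketch of what a router decodes when a module does expose live diagnostics (DDM). Under SFF-8472 the real-time readings sit in bytes 96 to 105 of the A2h page, so a stick with a static EEPROM has nothing changing there to report. The sketch assumes internally calibrated values; `decode_ddm` and its field names are made up for illustration.

```python
import math
import struct


def decode_ddm(a2_page: bytes) -> dict:
    """Decode SFF-8472 real-time readings (internal calibration assumed)."""
    # Bytes 96..105: temperature (signed), Vcc, Tx bias, Tx power, Rx power.
    temp_raw, vcc_raw, bias_raw, tx_raw, rx_raw = struct.unpack(
        ">hHHHH", a2_page[96:106]
    )

    def to_dbm(raw: int) -> float:
        milliwatts = raw * 0.0001  # raw unit is 0.1 uW
        return 10 * math.log10(milliwatts) if milliwatts > 0 else float("-inf")

    return {
        "temperature_c": temp_raw / 256.0,  # 1/256 degree C per LSB
        "voltage_v": vcc_raw * 0.0001,      # 100 uV per LSB
        "bias_ma": bias_raw * 0.002,        # 2 uA per LSB
        "tx_dbm": to_dbm(tx_raw),
        "rx_dbm": to_dbm(rx_raw),
    }
```

Feeding it a 256-byte A2h dump returns the five values a router would normally display for the module.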

rndm2 commented 1 year ago

Maybe there are some options to get those stats via console? I want to add it to Zabbix monitoring

Anime4000 commented 1 year ago

Maybe there are some options to get those stats via console? I want to add it to Zabbix monitoring

You could connect via telnet, query the values by calling diag, then close the session.

This got me thinking... if possible, put a basic status on the login page, then curl the result?

rajkosto commented 1 year ago

probably easier to remove logon check for the status page itself ?

Anime4000 commented 1 year ago

That would need patching the mini web server binary 😆 So, modify login.asp and put the basic status there.

rndm2 commented 1 year ago

Heh. No vi, nano, etc on this SFP. How do you guys modify files? upd: RO file system. This wouldn't be easy

Anime4000 commented 1 year ago

Just the usual way. I have made some scripts to run the firmware on a local computer; that way you can modify the firmware and then flash it: https://github.com/Anime4000/RTL960x/tree/main/Tools/emulator

skon77 commented 1 year ago

Maybe there are some options to get those stats via console? I want to add it to Zabbix monitoring

via Telnet, e.g.:

> diag pon get transceiver bias-current
Bias Current: 10.936000 mA

> diag pon get transceiver part-number
Part Number: RTL8290

> diag pon get transceiver rx-power
Rx Power: -24.559320 dBm

> diag pon get transceiver temperature
Temperature: 85.000000 C

> diag pon get transceiver tx-power
Tx Power: -14.46720 inf dBm

> diag pon get transceiver vendor-name
Vendor Name: REALTEK

> diag pon get transceiver voltage
Voltage: 3.019400 V

> diag gpon get onu-state
ONU state: Operation State(O5)

> omcicli get onuid
ONU ID: 65

> omcicli get state
ONU state: 5
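For monitoring, output lines like the ones above could be turned into numbers with a small parser. This is only a sketch based on the sample output in this comment; `parse_diag_reading` is a made-up helper, and other firmware versions may print different formats.

```python
import re

# "Label: number [extra tokens] [unit]", e.g. "Rx Power: -24.559320 dBm"
_READING_RE = re.compile(r"^\s*([^:]+):\s*(-?\d+(?:\.\d+)?)\b(.*)$")


def parse_diag_reading(line: str):
    """Return (label, value, unit), or None if the line has no numeric value."""
    match = _READING_RE.match(line)
    if match is None:
        return None
    label, number, rest = match.groups()
    tokens = rest.split()
    # Use the last token as the unit so "-14.46720 inf dBm" still yields "dBm".
    unit = tokens[-1] if tokens else ""
    return label.strip(), float(number), unit
```

Text-only lines such as `Part Number: RTL8290` return `None`, so a caller can simply skip them.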

ToTheCLI commented 1 year ago

pon get transceiver temperature Temperature: 58.625000 C

diag pon get transceiver temperature Temperature: 85.000000 C

Your Temps seem too high!!

skon77 commented 1 year ago

This is just an example. That temperature is from a Dlink DPN-100 SFP, which is a very hot stick.

My stick:
> diag pon get transceiver temperature
Temperature: 25.078125 C

Although this is also far from reality: the case feels much hotter than that.

rndm2 commented 1 year ago

Where do you guys get new firmwares? Manufacturer not answering emails and don't have firmwares on website :(

skon77 commented 1 year ago

Where do you guys get new firmwares? Manufacturer not answering emails and don't have firmwares on website :(

Mostly from new stick shipments and ROM dump.

rndm2 commented 1 year ago

Fast and dirty solution to send data to Zabbix using zabbix_sender. In order to get it working you need:

  1. Expose the SFP admin area port to the Zabbix server; in my case it is 8555, and Zabbix is self-hosted
  2. Create Zabbix trapper items with corresponding names and data types (numeric float)
random@random:~/zabbix-bcn-sfp$ cat sfp-login.sh 
#!/bin/bash
curl 'http://your.hostname:8555/boaform/admin/formLogin' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Accept-Language: en' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'DNT: 1' \
  -H 'Origin: http://10.10.100.100' \
  -H 'Pragma: no-cache' \
  -H 'Referer: http://10.10.100.100/admin/login.asp' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw 'challenge=&username=admin&password=39383737373ff&save=Login&submit-url=%2Fadmin%2Flogin.asp' \
  --compressed \
  --insecure
random@random:~/zabbix-bcn-sfp$ cat sfp-data.sh 
#!/bin/bash
curl 'http://your.hostname:8555/status_pon.asp' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Accept-Language: en' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'DNT: 1' \
  -H 'Pragma: no-cache' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --compressed \
  --insecure
random@random:~/zabbix-bcn-sfp$ cat get-data.sh 
#!/bin/bash
./sfp-login.sh
./sfp-data.sh > data.html
awk  -F '[<>]' '/<td / { gsub(/<b>/, ""); sub(/ .*/, "", $3); print $5 } ' data.html > parsed.txt
rm ./data.html

random@random:~/zabbix-bcn-sfp$ cat send-data.sh 
#!/bin/bash
# Keep the leading minus sign (optical power in dBm is usually negative) and quote the value.
/usr/bin/zabbix_sender -z 127.0.0.1 -s "your.hostname" -k dfp34x2c2.temperature -o "$(sed '7!d' parsed.txt | grep -o -- '-\?[0-9.]\+' | head -n 1)"
/usr/bin/zabbix_sender -z 127.0.0.1 -s "your.hostname" -k dfp34x2c2.voltage -o "$(sed '9!d' parsed.txt | grep -o -- '-\?[0-9.]\+' | head -n 1)"
/usr/bin/zabbix_sender -z 127.0.0.1 -s "your.hostname" -k dfp34x2c2.tx.power -o "$(sed '11!d' parsed.txt | grep -o -- '-\?[0-9.]\+' | head -n 1)"
/usr/bin/zabbix_sender -z 127.0.0.1 -s "your.hostname" -k dfp34x2c2.rx.power -o "$(sed '13!d' parsed.txt | grep -o -- '-\?[0-9.]\+' | head -n 1)"
/usr/bin/zabbix_sender -z 127.0.0.1 -s "your.hostname" -k dfp34x2c2.bias.current -o "$(sed '15!d' parsed.txt | grep -o -- '-\?[0-9.]\+' | head -n 1)"

random@random:~/zabbix-bcn-sfp$ cat exec.sh 
#!/bin/bash
cd /home/random/zabbix-bcn-sfp
./get-data.sh
./send-data.sh

And put exec.sh in cron, of course.
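A possible refinement, sketched under assumptions: instead of one zabbix_sender call per item, the readings could be written to a file of `<hostname> <key> <value>` lines and sent in one go with zabbix_sender's `-i` input-file option. The item keys mirror the trapper items above; `zabbix_sender_lines` is a made-up helper name.

```python
def zabbix_sender_lines(host: str, readings: dict, prefix: str = "dfp34x2c2") -> list:
    """Format readings as '<hostname> <key> <value>' lines for zabbix_sender -i."""
    return [f"{host} {prefix}.{key} {value}" for key, value in readings.items()]


def write_input_file(path: str, lines: list) -> None:
    """Write one item per line; the file can then be passed to zabbix_sender -i."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("\n".join(lines) + "\n")
```

With the file written, a single `zabbix_sender -z 127.0.0.1 -i <file>` call would replace the five separate invocations.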

nguyenthanhhong-tg commented 1 year ago

Any chances to get it working?

Screenshot 2022-11-26 at 23 59 28

Mikrotik RB4011iGS+ DFP-34X-2C2 with M110_sfp_ODI_220923 firmware

What version is your MikroTik running? If you are on version 6, you will see all the information, but on version 7 you will not see anything.

rndm2 commented 1 year ago

What version is your MikroTik running? If you are on version 6, you will see all the information, but on version 7 you will not see anything.

7.6

nguyenthanhhong-tg commented 1 year ago

What version is your MikroTik running? If you are on version 6, you will see all the information, but on version 7 you will not see anything.

7.6

Can you try to downgrade to version 6.48.6 and check again?

zentavr commented 1 year ago

@nguyenthanhhong-tg it shows nothing at 6.x and 7.x. Just tested with my router today.

nguyenthanhhong-tg commented 1 year ago

I understand your issue. I also own an RB4011iGS+RM. It has a hard time recognizing SFP sticks, especially GPON SFP sticks. In some cases I have to disable auto-negotiation and force the speed to 1 Gbps to bring sfp-sfpplus1 up.

zentavr commented 1 year ago

I did not change any speed setting at the ODI stick and have Auto Negotiation with Mikrotik. Works fine so far.

pic1 pic2

smnrock commented 1 year ago

I'm not good at Mikrotik scripts, but I made a dirty solution that reads the status from the web GUI and sends it through Telegram every hour. It works well for me.

#/log info "gponhealth start"
:do {
/tool fetch url="http://192.168.1.1/boaform/admin/formLogin?username=admin&password=xxxxx&save=Login&submit-url=^%^2Fadmin^%^2Flogin.asp" keep-result=no 
} on-error={ :put "login failed"};
:delay 2s
#/log info "gponhealth login"
/tool fetch url="http://192.168.1.1/status_pon.asp" keep-result=yes
#/log info "gponhealth ponstatus"
:delay 6s
:local statuscontent [/file get status_pon.asp contents];
:local statusf [:pick $statuscontent 1086 1097];
:local statusfinal ("GPON-Status\nTemperature = " .$statusf);
:set statusf [:pick $statuscontent 1218 1228];
:set statusfinal ($statusfinal . "\nVoltage = " . $statusf);
:set statusf [:pick $statuscontent 1350 1363];
:set statusfinal ($statusfinal . "\nTx Power = " . $statusf);
:set statusf [:pick $statuscontent 1485 1500];
:set statusfinal ($statusfinal . "\nRx Power = " . $statusf);
:set statusf [:pick $statuscontent 1626 1638];
:set statusfinal ($statusfinal . "\nBias Current = " . $statusf);
#/log info $statusfinal;
:local pload "chat_id=-xxxxxxxx&text=$statusfinal"
#/log info "before fetch"
/tool fetch mode=https http-method=post http-data=$pload url="https://api.telegram.org/botXXXXXXX/sendMessage" keep-result=no;
#/log info "gponhealth script completed"

Create a scheduler that runs the script every hour or so, and it will send the message to Telegram.
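The fixed character offsets in the script will break if the page layout shifts between firmware versions. As a sketch of a sturdier approach, the values could be looked up by their label instead. This assumes the status page is a table where each label cell is followed by its value cell; the sample markup in the usage note is hypothetical, not copied from a real stick.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self.cells.append("".join(self._buffer).strip())
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)


def status_by_label(html: str) -> dict:
    """Pair cells as label -> value (assumes alternating label/value cells)."""
    collector = CellCollector()
    collector.feed(html)
    collector.close()
    cells = collector.cells
    return dict(zip(cells[0::2], cells[1::2]))
```

For hypothetical markup such as `<tr><td><b>Temperature</b></td><td>45.2 C</td></tr>`, the lookup is then `status_by_label(page)["Temperature"]`, which keeps working if the surrounding HTML grows or shrinks.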

image

Anime4000 commented 1 year ago

wow this is very nice, I add this on the README.md about reading GPON ONU SFP status.

smnrock commented 1 year ago

Thanks.

Just an additional note for people who are not well versed in scripts: please replace the ONU password, Telegram chat ID and bot ID in the above script with your own.

Strykar commented 10 months ago

Please report this upstream so it may be fixed instead of here, no?

Anime4000 commented 10 months ago

HSGQ/ODI still use Static EEPROM, best way is this for now @smnrock

Strykar commented 10 months ago

I wrote a Prometheus collector to grok the SFP SoC and optical signal stats via HTML and plot the data in Grafana since I use Telegram for alerting, not monitoring.

Put the collector somewhere in your $PATH and create a systemd user service and start / enable the service. The collector runs as a simple web server on port 8000, you can view what it's exporting by curl -L http://localhost:8000.

# /home/strykar/.config/systemd/user/odi.service
[Unit]
Description=Prometheus collector for the HSGQ / ODI GPON SFP
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /home/strykar/.bin/hsgq_prometheus_collector.py

[Install]
WantedBy=default.target

Add the collector to Prometheus under any existing scrape_configs:

  - job_name: "gpon_collector"
    scrape_interval: 5m  # Adjust this to how frequently you want Prometheus to scrape metrics
    static_configs:
      - targets: ["localhost:8000"]
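To show what Prometheus receives when it scrapes such a target, here is a stdlib-only sketch that renders gauges in the Prometheus text exposition format. It is not taken from the collector itself; the metric and label names are placeholders.

```python
def render_gauges(metrics: dict, labels: dict) -> str:
    """Render name -> value pairs as Prometheus text-format gauges."""
    label_text = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    lines = []
    for name, value in metrics.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"
```

Calling `render_gauges({"gpon_rx_power_dbm": -24.5}, {"host": "192.168.1.1"})` yields a `# TYPE` line followed by one sample line, which is the shape of the response body served on the scrape port.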

Ensure Grafana can poll Prometheus and create a dashboard in Grafana: Screenshot from 2024-01-26 11-17-32

Anime4000 commented 9 months ago

Nice @Strykar !

I add this into README.md for GPON Stats via Grafana

jjeziorny commented 9 months ago

HI @Strykar , how do we handle authentication to the GPON with your collector script?

The web interface does not allow unauthenticated access. I've tried basic authentication (adding user:pass to the URL), but that doesn't work.

Strykar commented 9 months ago

@jjeziorny Web auth is convoluted with MD5 hashing, I have been unable to login to the web interface via terminal. If you find and post the md5.js file used during the login process, I can try to add authentication to the current collector.

I found that it does not log me out for hours, so just signing in once per day manually for now does the trick. If login via script is not possible, I will move the script to auth over SSH and poll the data from the terminal.

smnrock commented 9 months ago

@Strykar you can try using the url format with login details appended like the one i gave in the mikrotik script above.

Strykar commented 9 months ago

@jjeziorny @smnrock Try - https://gist.github.com/Strykar/48fa636e55de0a33ff6a137ed4a09538 Let me know if it works and I'll update the repo.

jjeziorny commented 9 months ago

Seems to work fine

root@prometheus:/usr/local/etc/rc.d # /usr/local/bin/python3 /root/hsgq_prometheus_collector2.py

Logged in successfully

Strykar commented 9 months ago

@jjeziorny Cool, this needs to be refactored at some point to deal with the firmware quirks of single-login only and its related errors. SSH may be easier to deal with than HTTP (state) in Python for this.

jjeziorny commented 9 months ago

Yeah, or modify the firmware and get snmp agent on it

Strykar commented 9 months ago

If only HSGQ released source code.

jjeziorny commented 9 months ago

btw, at some point if you have the time you can add command line arguments to it so people can set IP, password and listener port without having to edit the script.

In my case I had to create two scripts as I have two GPONs in my network. image
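A minimal sketch of how that could look with argparse, assuming repeatable flags so that two sticks fit in one invocation. The flag names and the `parse_targets` helper are illustrative, not the collector's actual interface.

```python
import argparse


def parse_targets(argv=None):
    """Parse repeated --hostname/--password pairs plus one listener port."""
    parser = argparse.ArgumentParser(description="Hypothetical GPON SFP collector CLI")
    parser.add_argument("--hostname", action="append", required=True,
                        help="SFP address; repeat once per stick")
    parser.add_argument("--password", action="append", required=True,
                        help="one password per --hostname, in the same order")
    parser.add_argument("--user", default="admin")
    parser.add_argument("--webserver-port", type=int, default=8000)
    args = parser.parse_args(argv)

    if len(args.hostname) != len(args.password):
        parser.error("pass exactly one --password per --hostname")

    targets = [
        {"hostname": host, "user": args.user, "password": password}
        for host, password in zip(args.hostname, args.password)
    ]
    return targets, args.webserver_port
```

Each `--hostname` is paired with the `--password` in the same position, so one process can poll both GPONs without a second copy of the script.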

Strykar commented 9 months ago

These two commands appear to be ODI specific and do not work with the HSGQ firmware (V1.0-220923):

# omcicli get onuid
# omcicli get state

Usage: omcicli get [cmd]

  sn        : get serial number
  log       : get runtime log level
  logfile   : get omci msg log mode/action mask
  tables    : get all registered MIB tables
  devmode   : get omci device mode
  dmmode    : get dual mgmt mode
  loid      : get loid and password
  loidauth  : get loid auth status
  cflag     : get customized flag
  authuptime    : get auth uptime

Any idea how to get this info on the HSGQ firmware via terminal?

@jjeziorny Yea I'll flesh that out in the SSH version I am working on, did not think to support multiple GPONs at first.

Strykar commented 9 months ago

@jjeziorny New prometheus collector that uses SSH instead of parsing HTML - https://gist.github.com/Strykar/584c6467ed023f90b13a059f511d4d1c and can deal with multiple SFPs.

jjeziorny commented 9 months ago

@jjeziorny New prometheus collector that uses SSH instead of parsing HTML - https://gist.github.com/Strykar/584c6467ed023f90b13a059f511d4d1c and can deal with multiple SFPs.

OK, just switched to it and seems to be running fine.

# ps aux
USER         PID %CPU %MEM     VSZ    RSS TT  STAT STARTED    TIME COMMAND
root       14537  5.2  0.4   57352  32112  -  SJ   08:52   0:00.24 /usr/local/bin/python3 /root/hsgq_prometheus_collector.py --hostname 192.168.2.5 --port 22 --user admin --password xxxx --webserver-port 8001 (python3.9)
root       14530  2.2  0.4   57352  32108  -  SJ   08:52   0:00.24 /usr/local/bin/python3 /root/hsgq_prometheus_collector.py --hostname 192.168.2.1 --port 22 --user admin --password xxxx --webserver-port 8000 (python3.9)
root       14529  0.0  0.0   12816   2200  -  SsJ  08:52   0:00.00 daemon: /usr/local/bin/python3[14530] (daemon)
root       14536  0.0  0.0   12816   2200  -  SsJ  08:52   0:00.00 daemon: /usr/local/bin/python3[14537] (daemon)

Strykar commented 9 months ago

Cool, it can work via just one process BTW:

/usr/bin/python hsgq_prometheus_collector.py \
    --hostname 192.168.1.1 --port 22 --user admin --password pass1 \
    --hostname 192.168.1.2 --port 22 --user admin --password pass2

jjeziorny commented 9 months ago

Hi @Strykar, after not having much time over the weekend to tweak things, I finally got a chance this morning, but ran into some issues.

  1. CPU/MEM stats are not available in the SSH version
  2. GPON state was always coming back as 0
  3. The SFP only allows a single SSH connection, so if you are SSH'ed into the device, the script fails to connect.

So, based on point 3, I don't think it's a good idea to collect these via SSH.

I've reverted back to the HTTP version for now
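One way the single-session limit could be softened is to retry the connection a few times before giving up. This is only a sketch: it assumes a busy stick surfaces as an `OSError` (for example a refused connection), and `call_with_retry` is a made-up helper, not part of the collector.

```python
import time


def call_with_retry(connect, attempts=3, delay_s=5.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:  # assumption: a busy session raises OSError
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)  # wait for the other session to end
    raise last_error
```

The `sleep` parameter exists only so the delay can be swapped out when testing; in normal use the default `time.sleep` applies.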

Strykar commented 9 months ago

There's no point in an HTTP version, here's a small example of stats that could be pulled by using SSH:

# diag gpon get ?
gpon get 
active-timer                                     - activation timer configuration
aes-framecnt                                     - AES frame counter
alarm-status                                     - current alarm status
auto-boh                                         - auto update BOH configuration
auto-tcont                                       - auto add or delete tcont configuration
bwmap                                            - bwmap
dbru-block-size                                  - dbru block size
ds-bwmap                                         - downstream BWMAP configuration
ds-eth                                           - downstream ethernet configuration
ds-flow                                          - downstream flow configuration
ds-gem                                           - downstream GEM configuration
ds-laser                                         - downstream laser configuration
ds-omci                                          - downstream OMCI configuration
ds-phy                                           - downstream PHY configuration
ds-ploam                                         - downstream PLOAM configuration
eqd-offset                                       - EQD offset configuration
multicast-filter                                 - multicast filter configuration
multicast-filter-entry                           - multicast filter entry configuration
onu-state                                        - ONU state, O1-O7
password                                         - password configuration
password-hex                                     - password configuration
pps-cnt                                          - PPS cnt information
rdi                                              - RDI configuration
rogue-sd-cnt                                     - rogue ont SD cnt information
serial-number                                    - serial number configuration
serial-number-hex                                - serial number configuration
serialnumber                                     - serial number configuration
tcont                                            - TCONT configuration
tx                                               - transmit configuration
us-dbr                                           - upstream DBR configuration
us-flow                                          - upstream flow configuration
us-laser                                         - upstream laser configuration
us-phy                                           - upstream PHY configuration
us-ploam                                         - upstream PLOAM configuration

There are a bajillion more metrics available via the terminal:

# diag ?

acl                                              - acl configuration
auto-fallback                                    - Auto-Fallback configuration
bandwidth                                        - bandwidth configuration
classf                                           - classification configuration
cpu                                              - cpu configuration
debug                                            - debug configuration
dot1x                                            - dot1x configuration
epon                                             - epon configuration
exit                                             - exit diag shell
field-selector                                   - field selector configuration
flowctrl                                         - flowctrl configuration
gpon                                             - GPON configuration
i2c                                              - I2C configuration
igmp                                             - igmp configuration
interrupt                                        - interrupt configuration
iol                                              - iol configuration
l2-table                                         - l2 table configuration
l34                                              - L34 configuration
led                                              - led configuration
meter                                            - shared meter configuration
mib                                              - mib configuration
mirror                                           - mirror configuration
oam                                              - oam  configuration
pbo                                              - pbo configuration
pon                                              - pon configuration
port                                             - port configuration
ppstod                                           - ppstod configuration
qos                                              - qos configuration
register                                         - register configuration
rldp                                             - rldp configuration
rlpp                                             - RLPP configuration
rma                                              - rma configuration
security                                         - security configuration
storm-control                                    - storm-control configuration
stp                                              - stp configuration
svlan                                            - svlan configuration
switch                                           - switch configuration
time                                             - time configuration
trap                                             - trap configuration
trunk                                            - trunk configuration
vlan                                             - vlan configuration

Use what works for now, I will keep updating the collector and dashboard as I get time since I intend to track these metrics, especially the receive signal. IIRC, the web interface restricts you to one user too.
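As a small illustration of pulling one of these metrics, the ONU state can be read out of the two output formats quoted earlier in this thread (`ONU state: Operation State(O5)` from diag and `ONU state: 5` from omcicli). The state names follow the O1 to O7 activation states of ITU-T G.984.3; `parse_onu_state` is a made-up helper, and other firmware may print something different.

```python
import re

# GPON ONU activation states as named in ITU-T G.984.3.
ONU_STATES = {
    1: "Initial",
    2: "Standby",
    3: "Serial Number",
    4: "Ranging",
    5: "Operation",
    6: "POPUP",
    7: "Emergency Stop",
}


def parse_onu_state(output: str):
    """Return the ONU state number (1-7), or None if it cannot be found."""
    match = re.search(r"\(O([1-7])\)", output)  # diag: "...State(O5)"
    if match is None:
        match = re.search(r"ONU state:\s*([1-7])\b", output)  # omcicli: "ONU state: 5"
    return int(match.group(1)) if match else None
```

Returning `None` rather than 0 for unparsable output makes a parsing failure distinguishable from a real state in the exported metric.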

ffxenon commented 8 months ago

Made some modifications to @smnrock's RouterOS script.

:local ponip "http://<your own PON IP>"
:local s1 1086

:local ulogin ($ponip."/boaform/admin/formLogin?username=admin&password=admin&save=Login&submit-url=^%^2Fadmin^%^2Flogin.asp")
:local ustatus ($ponip."/status_pon.asp")
:local ulogout ($ponip."/boaform/admin/formLogout")

:do {
/tool fetch url=$ulogin keep-result=no
:delay 2s
} on-error={
:delay 2s
/tool fetch url=$ulogin keep-result=no
}
:local result [/tool fetch url=$ustatus as-value output=user];
:delay 1s
:local PonTemp [:pick [:pick $result 0] $s1 ($s1 + 2)];
/log info ("PON ----------------- Temperature = ".$PonTemp);
:local CpuTemp [/sys hea get 1 value]
/log info ("CPU ----------------- Temperature = ".$CpuTemp);
/tool fetch url=$ulogout keep-result=no

risingphoenix23 commented 8 months ago

@Strykar connect with me over email airs.prime1@gmail.com will buy the industrial hsgq ones.

longthanhtran commented 6 months ago

suddenly see my edgerouter shows details about my DFP-34X-2C2 today, not sure if it's correct or not.

image

Anime4000 commented 6 months ago

that's correct

Strykar commented 5 months ago

suddenly see my edgerouter shows details about my DFP-34X-2C2 today, not sure if it's correct or not.

image

What hardware revision is your ODI SFP?

longthanhtran commented 5 months ago

no idea, I only know its fw version, V1.0-220923

Anime4000 commented 5 months ago

no idea, I only know its fw version, V1.0-220923

Look at the SFP pins/pads; the revision is printed there, something like: STICK3V06

vuducdong commented 5 months ago

no idea, I only know its fw version, V1.0-220923

Look at the SFP pins/pads; the revision is printed there, something like: STICK3V06

It's stick V06.

Anime4000 commented 5 months ago

welp, need V08 to have DDM