twstokes / arris-scrape

Arris modem status page scraper that uploads signal values to InfluxDB and Grafana.
https://www.tannr.com/2021/03/22/scraping-an-arris-cable-modem-status-page/
MIT License

SB6183 index out of range #5

Closed shadowpuck99 closed 2 years ago

shadowpuck99 commented 2 years ago

Hello - I was hoping I could get a suggestion as to what I'm missing here. This is a SB6183 modem, and I'm using Option B. I had this working previously with my CM820A, but have changed modems. I started fresh with the 6183 - so, did Option B from the top, other than installing Docker - which I already had installed.

I'm guessing the script is complaining about something in my SB6183 page, but, a suggestion or thought would be super helpful.

I believe I have config.py set up correctly - I set my model as SB6183, provided the URL, etc. It looks like the modem database exists in influxdb.

Modem scraper running.
Error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x7f355e6370c0>)
Error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x7f355e67a4c0>)
Error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x7f355e5ae940>)
Error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x7f355e594440>)
Error: (<class 'IndexError'>, IndexError('list index out of range'), <traceback object at 0x7f355e5251c0>)
Abort! Max retries reached: 5

twstokes commented 2 years ago

👋 @shadowpuck99. This error tells me that the parsing didn't quite work as expected because the script tried to access an element in a list that didn't exist.

If you print out the stack trace for the traceback object, you should get the exact line where this occurred. That'll be helpful in troubleshooting what data was fed to its caller, and so on.

Here's a list of functions you can call on that type of object.

If you still get stuck, feel free to send me the saved output of your modem's status screen and I can try to see where the parsing breaks. There shouldn't be any sensitive info on there, but if you spot some, please delete it in the HTML source first.
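As a rough illustration of printing that stack trace: the logged tuple looks like the (type, value, traceback) triple returned by sys.exc_info(), so the standard traceback module can format it. This is only a sketch under that assumption; parse_row and describe_error are hypothetical names, not functions from the project.

```python
import sys
import traceback


def describe_error(exc_info):
    """Return a formatted stack trace for a (type, value, traceback) tuple."""
    exc_type, exc_value, exc_tb = exc_info
    # format_exception returns a list of strings, each ending in a newline.
    return "".join(traceback.format_exception(exc_type, exc_value, exc_tb))


def parse_row(cells):
    # Hypothetical stand-in for the scraper's parsing step:
    # index 6 does not exist in a short row, so this raises IndexError.
    return cells[6]


try:
    parse_row(["1", "ATDMA", "2560"])
except IndexError:
    report = describe_error(sys.exc_info())
    print(report)
```

The printed report names the file, line number, and function of every frame, which is usually enough to see which table row tripped the parser.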

shadowpuck99 commented 2 years ago

@twstokes hi tanner! sorry to bug you on this type of thing again; your help with my old modem was invaluable. ok, what you are saying about it not finding an element it expected makes sense to me (to the level i understand what's happening here). i seem to notice that the fields in my status page do not match the sb6183 fields in this script. i'm not sure how to fix that completely - i did give things a quick try by changing the field names in the script but, at some point, managed to lock up the status interface of my modem! haha.

attached is my connection status page... RgConnect_asp_Output.txt .

twstokes commented 2 years ago

It looks like this line for downstream should be:

for table_row in soup.find_all("table")[1].find_all("tr")[2:]:

and this line for upstream should be:

for table_row in soup.find_all("table")[2].find_all("tr")[2:]:

In your source HTML there's a form containing a table that's commented out. On @jdburton's modem (who initially supplied this target) it might be shown, so his tables were index 2 and 3 when yours are 1 and 2.

Your scraper was processing a downstream table with upstream input, and since it has fewer columns the index went out of range.
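To show why a commented-out table shifts the indexes, here is a small sketch that uses the standard library's html.parser as a stand-in for the project's BeautifulSoup parsing. The two HTML strings are hypothetical, heavily simplified pages, not the real modem markup.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Count <table> start tags that the parser treats as real markup."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # Markup inside <!-- ... --> goes to handle_comment instead,
        # so commented-out tables never reach this method.
        if tag == "table":
            self.tables += 1


def count_tables(html):
    counter = TableCounter()
    counter.feed(html)
    counter.close()
    return counter.tables


# Hypothetical pages: one with the form's table visible, one with it commented out.
visible_form = "<table>nav</table><form><table>cfg</table></form><table>down</table><table>up</table>"
commented_form = "<table>nav</table><!-- <form><table>cfg</table></form> --><table>down</table><table>up</table>"

print(count_tables(visible_form))    # 4 tables: downstream is index 2
print(count_tables(commented_form))  # 3 tables: downstream is index 1
```

Selecting tables by position is fragile for this reason; matching on a header cell's text would likely hold up better across firmware variants.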

shadowpuck99 commented 2 years ago

weird - this appears to be similar to the change i had to make for my old modem - the CM820A.

that appears to have helped!

i'm now getting down/up power and SNR, but, nothing for correcteds or uncorrectables. i think one of the things that's confusing for me - is if i'm reading the py script right, it seems like the keys aren't matching up with what's in my table....i must be misunderstanding that part!

anyway - now i need to figure out why corrected/uncorrectables aren't reporting....

twstokes commented 2 years ago

Here's what I saw with the file you shared above:

[image]

It shows all zeroes for correcteds and uncorrectables, except for a single value of 1 on channel 12.

I added a debugger tool that may be helpful in the tools directory. To see what it'd write to InfluxDB, just:

  1. Run pip install -r requirements.txt in the root folder (if you haven't already for the project)
  2. Copy the file you shared with me and add it to the tools directory
  3. Set a config like:
debug_config = {
    'modem_model': 'SB6183',
    'modem_url': 'RgConnect_asp_Output.txt',
    'is_remote': False
}
  4. Then in the tools directory run: python3 debugger.py

You should see an output like below, which to me matches what your sample HTML had:

Starting debugger.
{'measurement': 'downstream', 'tags': {'downstream_id': '1', 'modulation': 'QAM256'}, 'fields': {'snr': 37.0, 'dcid': 1, 'freq': 561000000.0, 'power': 4.2, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '2', 'modulation': 'QAM256'}, 'fields': {'snr': 37.1, 'dcid': 2, 'freq': 567000000.0, 'power': 4.6, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '3', 'modulation': 'QAM256'}, 'fields': {'snr': 37.0, 'dcid': 3, 'freq': 573000000.0, 'power': 4.6, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '4', 'modulation': 'QAM256'}, 'fields': {'snr': 36.9, 'dcid': 4, 'freq': 579000000.0, 'power': 4.4, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '5', 'modulation': 'QAM256'}, 'fields': {'snr': 36.8, 'dcid': 5, 'freq': 585000000.0, 'power': 3.9, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '6', 'modulation': 'QAM256'}, 'fields': {'snr': 36.7, 'dcid': 6, 'freq': 591000000.0, 'power': 3.6, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '7', 'modulation': 'QAM256'}, 'fields': {'snr': 36.6, 'dcid': 7, 'freq': 597000000.0, 'power': 3.5, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '8', 'modulation': 'QAM256'}, 'fields': {'snr': 36.6, 'dcid': 8, 'freq': 603000000.0, 'power': 3.6, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '9', 'modulation': 'QAM256'}, 'fields': {'snr': 36.4, 'dcid': 11, 'freq': 621000000.0, 'power': 3.7, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '10', 'modulation': 'QAM256'}, 'fields': {'snr': 35.3, 'dcid': 25, 'freq': 705000000.0, 'power': 2.2, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '11', 'modulation': 'QAM256'}, 'fields': {'snr': 35.3, 'dcid': 26, 'freq': 711000000.0, 'power': 2.2, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '12', 'modulation': 'QAM256'}, 'fields': {'snr': 35.2, 'dcid': 27, 'freq': 717000000.0, 'power': 2.3, 'octets': 0, 'correcteds': 1, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '13', 'modulation': 'QAM256'}, 'fields': {'snr': 35.2, 'dcid': 28, 'freq': 723000000.0, 'power': 2.5, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '14', 'modulation': 'QAM256'}, 'fields': {'snr': 35.2, 'dcid': 29, 'freq': 729000000.0, 'power': 2.5, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '15', 'modulation': 'QAM256'}, 'fields': {'snr': 35.1, 'dcid': 30, 'freq': 735000000.0, 'power': 2.6, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'downstream', 'tags': {'downstream_id': '16', 'modulation': 'QAM256'}, 'fields': {'snr': 35.1, 'dcid': 31, 'freq': 741000000.0, 'power': 2.5, 'octets': 0, 'correcteds': 0, 'uncorrectables': 0}}
{'measurement': 'upstream', 'tags': {'upstream_id': '1', 'modulation': None, 'channel_type': 'ATDMA'}, 'fields': {'ucid': 68, 'freq': 19900000.0, 'power': 43.5, 'symbol_rate': 2560}}
{'measurement': 'upstream', 'tags': {'upstream_id': '2', 'modulation': None, 'channel_type': 'ATDMA'}, 'fields': {'ucid': 67, 'freq': 23500000.0, 'power': 44.3, 'symbol_rate': 2560}}
{'measurement': 'upstream', 'tags': {'upstream_id': '3', 'modulation': None, 'channel_type': 'ATDMA'}, 'fields': {'ucid': 66, 'freq': 28700000.0, 'power': 44.8, 'symbol_rate': 5120}}
{'measurement': 'upstream', 'tags': {'upstream_id': '4', 'modulation': None, 'channel_type': 'ATDMA'}, 'fields': {'ucid': 65, 'freq': 35500000.0, 'power': 45.0, 'symbol_rate': 5120}}

shadowpuck99 commented 2 years ago

i will take a look at this - thank you. work got in the way of things yesterday. what's odd to me is that the scraper seems to have a field "octets" that i do not see in my page. the other strange thing was that my errors in grafana briefly went up based on what was on my status page.

so, if i changed the time base in grafana i saw a spike of errors on one channel (which is correct) but then in grafana it went back down to 0 after the spike - in other words, it didn't stay at the same level. so, one channel was 0 for awhile, then it was like 730 or something, but then went back down to zero even though the page still shows 730....

that's different than before.....

shadowpuck99 commented 2 years ago

i thought a screenshot might be more useful. Dashboard_050622

and, here's a current output of my status page. RgConnect_asp_Output_050622.txt

one last thing i just thought of - i changed the poll interval to much longer than the original default setting as this modem seems really sensitive to polling locking up the diag interface. that doesn't seem to have affected the other measurements going into grafana but maybe it's doing something strange for the error scraping? just thought i'd mention that....

twstokes commented 2 years ago

> what's odd to me is that the scraper seems to have a field "octets" that i do not see in my page.

This is a limitation of the scraper because it expects the field to be present. For your modem it just fills in a zero. The project just isn't flexible enough at this time to skip it.
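A minimal sketch of that fallback, assuming a per-model column map where a missing column is marked with None. The names cell_as_int and columns are hypothetical and this is not the project's actual code.

```python
def cell_as_int(cells, index, default=0):
    """Return cells[index] as an int, or default when the column is absent."""
    if index is None or index >= len(cells):
        return default
    return int(cells[index].strip())


# Hypothetical column layout for a modem page without an octets column.
columns = {"dcid": 0, "octets": None, "correcteds": 1, "uncorrectables": 2}
row = ["27", "1", "0"]

fields = {name: cell_as_int(row, idx) for name, idx in columns.items()}
print(fields)
```

With this layout the octets field is still written, but always as 0, which matches what the debugger output shows for this modem.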

> so, if i changed the time base in grafana i saw a spike of errors on one channel (which is correct) but then in grafana it went back down to 0 after the spike - in other words, it didn't stay at the same level. so, one channel was 0 for awhile, then it was like 730 or something, but then went back down to zero even though the page still shows 730....

Hmm, that's odd. I'd check to see if it's pulling in actual zeros (when it shouldn't) or if the Grafana query itself needs tweaking. You can run a query directly from within Grafana to see the raw values it's using.

twstokes commented 2 years ago

> one last thing i just thought of - i changed the poll interval to much longer than the original default setting as this modem seems really sensitive to polling locking up the diag interface. that doesn't seem to have affected the other measurements going into grafana but maybe it's doing something strange for the error scraping? just thought i'd mention that....

I did fix a pretty terrible bug a while back where the script wouldn't respect the delay - you may want to make sure you have the latest version.
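For illustration only, a polling loop that respects the delay could look roughly like the sketch below. The poll function and its parameters are hypothetical and do not mirror the project's code; the injectable sleep argument exists just to make the behaviour easy to check.

```python
import time


def poll(scrape, delay_seconds, max_retries=5, sleep=time.sleep, max_polls=None):
    """Call scrape() repeatedly, sleeping delay_seconds between attempts.

    Returns the number of successful polls. Stops after max_retries
    consecutive failures, or after max_polls successes when that is set.
    """
    failures = 0
    successes = 0
    while max_polls is None or successes < max_polls:
        try:
            scrape()
            successes += 1
            failures = 0
        except Exception as error:
            failures += 1
            print(f"Error: {error!r}")
            if failures >= max_retries:
                print(f"Abort! Max retries reached: {max_retries}")
                break
        # Sleep after failures too, so a struggling modem is not hammered with retries.
        sleep(delay_seconds)
    return successes


# Usage: record the requested sleeps instead of actually waiting.
naps = []
count = poll(lambda: None, delay_seconds=60, sleep=naps.append, max_polls=3)
print(count, naps)
```

Sleeping on the failure path as well matters for a modem whose diagnostics page locks up under frequent requests.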

shadowpuck99 commented 2 years ago

so, your thought on the query was spot on.....

here's the query for the downstream power, for example: SELECT mean("power") FROM "downstream" WHERE time >= now() - 1h and time <= now() GROUP BY time(5s), "downstream_id" fill(none)

here's the query for the corrected graph: SELECT non_negative_difference(last("correcteds")) FROM "downstream" WHERE time >= now() - 1h and time <= now() GROUP BY time(5s), "downstream_id" fill(none)

changing the corrected query to select mean as opposed to non_negative_difference was what i needed to do.
so, it really depends what kind of data you want to see - the non-negative difference will show you when the spikes occur in an (arguably) cleaner graph, but if you want to watch the error trends similar to the power and such, changing the query works better, imo.
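A small sketch of why the two queries draw different graphs, using hypothetical counter readings. The helper below only imitates what InfluxQL's non_negative_difference() does (subtract consecutive values and drop negative results); it is not how InfluxDB implements it.

```python
def non_negative_difference(samples):
    """Differences between consecutive samples, dropping negative results."""
    diffs = []
    for previous, current in zip(samples, samples[1:]):
        delta = current - previous
        if delta >= 0:
            diffs.append(delta)
    return diffs


# Hypothetical counter readings for one channel, one value per poll.
correcteds = [0, 0, 730, 730, 730]

print(non_negative_difference(correcteds))  # [0, 730, 0, 0]
print(correcteds)                           # [0, 0, 730, 730, 730]
```

The difference series spikes once and returns to zero because the counter stopped growing, while the raw values stay at 730 the way the modem's status page does.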

thank you again for the hints and assistance....

twstokes commented 2 years ago

That makes sense, and it's good to know. Thanks!