CAIDA / pybgpstream

Python bindings for BGPStream
https://bgpstream.caida.org
BSD 2-Clause "Simplified" License

Not able to parse the MRT file from pybgpstream #51

Open ACodingfreak opened 1 year ago

ACodingfreak commented 1 year ago

Hi All,

When I use the bgpreader command below, it works, but the equivalent pybgpstream code shown after it gets stuck:

bgpreader -d singlefile -o upd-file=https://data.ris.ripe.net/rrc00/2023.01/updates.20230120.1420.gz

import pybgpstream

stream = pybgpstream.BGPStream()
stream.set_data_interface("singlefile")
stream.set_data_interface_option("singlefile", "upd-file", "https://data.ris.ripe.net/rrc00/2023.01/updates.20230120.1420.gz")

for rec in stream.records():

    # do something with rec (e.g., choose to continue based on timestamp)
    print("Received %s record at time %d from collector %s" % (rec.type, rec.time, rec.collector))
    count = 0
    for elem in rec:
        count += 1
        print(type(elem))
        # do something with rec and/or elem

    print("Total number of elem: %d" % count)

It prints one record and then hangs until interrupted with Ctrl-C, as shown below:

u2004op2:~/bgp/pybgpstream_ex$ python3 01.py 
Received update record at time 1674266321 from collector singlefile
Total number of elem: 0
^CError in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 34, in apport_excepthook
    def apport_excepthook(exc_type, exc_obj, exc_tb):
KeyboardInterrupt

Original exception was:
Traceback (most recent call last):
  File "01.py", line 11, in <module>
    for rec in stream.records():
  File "/home/ipi/.local/lib/python3.8/site-packages/pybgpstream/pybgpstream.py", line 87, in records
    _rec = self.stream.get_next_record()
RuntimeError: Could not get next record (is the stream started?)

ACodingfreak commented 1 year ago

Is there anything wrong with my code?

ACodingfreak commented 1 year ago

After further troubleshooting:

Using the Python script from my previous message, I found that "populate_filter_cb" in bs_format_mrt.c always returns BGPSTREAM_PARSEBGP_FILTER_OUT, which leaves "bgpstream_parsebgp_populate_record" in bgpstream_parsebgp_common.c stuck in a loop.

  if (is_wanted_time(ts_sec, format->filter_mgr) != 0) {
    // we want this entry
    return BGPSTREAM_PARSEBGP_KEEP;
  } else {
    return BGPSTREAM_PARSEBGP_FILTER_OUT;
  }

If I use the bgpreader tool, the same check always returns "BGPSTREAM_PARSEBGP_KEEP".

Any input on why the IF check in the code above is needed? @alistairking

I tried removing the IF condition and always returning "BGPSTREAM_PARSEBGP_KEEP". With that change the records and elems are displayed, but the script shows more than what is available in the .bz2 file and then ends up in sleep mode.
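
To make the question about the IF check more concrete, here is a small Python model of the branch quoted from populate_filter_cb. It is only a sketch: the body of is_wanted_time below is an assumption (a record is wanted when no time interval is configured, or when its timestamp falls inside the configured interval), not the actual C implementation, and the interval tuple is a hypothetical stand-in for whatever format->filter_mgr holds.

```python
from typing import Optional, Tuple

KEEP = "BGPSTREAM_PARSEBGP_KEEP"
FILTER_OUT = "BGPSTREAM_PARSEBGP_FILTER_OUT"

# Hypothetical stand-in for the filter manager's time interval:
# (begin, end) in epoch seconds; end=None means "no upper bound".
Interval = Optional[Tuple[int, Optional[int]]]


def is_wanted_time(ts_sec: int, interval: Interval) -> bool:
    """Assumed behaviour, not the real C code: accept everything when no
    interval is set, otherwise accept only timestamps inside the interval."""
    if interval is None:
        return True
    begin, end = interval
    return ts_sec >= begin and (end is None or ts_sec <= end)


def populate_filter(ts_sec: int, interval: Interval) -> str:
    """Mirrors the quoted if/else: KEEP when the time is wanted, else FILTER_OUT."""
    return KEEP if is_wanted_time(ts_sec, interval) else FILTER_OUT


if __name__ == "__main__":
    ts = 1674266321  # record time printed by the script above
    # No interval configured: under the assumption above, the record is kept.
    print(populate_filter(ts, None))
    # An interval that does not contain the record time filters it out.
    print(populate_filter(ts, (0, 0)))
```

Under this model, a stream whose interval never contains the record timestamps would filter out every record, which is consistent with the FILTER_OUT behaviour described above. Whether that is what actually happens in the pybgpstream case would need to be confirmed against the real filter_mgr state.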