SichangHe / internet_route_verification

RPSLyzer: Parse Routing Policy Specification Language from IRR and compare BGP routes against it
MIT License

Loosen routes check on potential scenarios where the AS posts incomplete RPSL #19

Closed SichangHe closed 6 months ago

SichangHe commented 1 year ago

As @cunha and I talked about today:

SichangHe commented 1 year ago

Neighbors vs rules stats

`neighbor` is -1 if the AS has no data in the AS relationship DB.

Polars output in Evcxr.

```rust
shape: (95_911, 4)
┌─────────┬──────────┬────────┬────────┐
│ aut_num ┆ neighbor ┆ import ┆ export │
│ ---     ┆ ---      ┆ ---    ┆ ---    │
│ u64     ┆ i32      ┆ u32    ┆ u32    │
╞═════════╪══════════╪════════╪════════╡
│ 202125  ┆ 3        ┆ 2      ┆ 2      │
│ 266498  ┆ 29       ┆ 4      ┆ 3      │
│ 132756  ┆ -1       ┆ 0      ┆ 0      │
│ 41937   ┆ 13       ┆ 8      ┆ 8      │
│ …       ┆ …        ┆ …      ┆ …      │
│ 201276  ┆ -1       ┆ 2      ┆ 2      │
│ 32284   ┆ 2        ┆ 0      ┆ 0      │
│ 399684  ┆ 3        ┆ 0      ┆ 0      │
│ 136615  ┆ 1        ┆ 0      ┆ 0      │
└─────────┴──────────┴────────┴────────┘
shape: (9, 5)
┌────────────┬───────────────┬────────────┬──────────┬───────────┐
│ describe   ┆ aut_num       ┆ neighbor   ┆ import   ┆ export    │
│ ---        ┆ ---           ┆ ---        ┆ ---      ┆ ---       │
│ str        ┆ f64           ┆ f64        ┆ f64      ┆ f64       │
╞════════════╪═══════════════╪════════════╪══════════╪═══════════╡
│ count      ┆ 95911.0       ┆ 95911.0    ┆ 95911.0  ┆ 95911.0   │
│ null_count ┆ 0.0           ┆ 0.0        ┆ 0.0      ┆ 0.0       │
│ mean       ┆ 125671.693403 ┆ 10.15602   ┆ 4.342171 ┆ 4.227722  │
│ std        ┆ 112437.127515 ┆ 102.243572 ┆ 52.40494 ┆ 48.594727 │
│ min        ┆ 1.0           ┆ -1.0       ┆ 0.0      ┆ 0.0       │
│ 25%        ┆ 34821.0       ┆ 1.0        ┆ 0.0      ┆ 0.0       │
│ 50%        ┆ 62122.0       ┆ 1.0        ┆ 1.0      ┆ 1.0       │
│ 75%        ┆ 205279.5      ┆ 3.0        ┆ 2.0      ┆ 2.0       │
│ max        ┆ 6.131644e6    ┆ 9628.0     ┆ 5724.0   ┆ 5344.0    │
└────────────┴───────────────┴────────────┴──────────┴───────────┘
```

as_neighbors_vs_rules.csv
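
Because `-1` is a sentinel rather than a real neighbor count, it may be worth masking it before computing statistics such as the mean. A minimal sketch, assuming the CSV has the four columns shown in the Polars output; the helper name `mask_missing_neighbors` and the inline sample (rows copied from that output) are only for illustration:

```python
import io

import pandas as pd

# A few rows copied from the Polars output; the real file has 95,911 rows.
SAMPLE_CSV = """aut_num,neighbor,import,export
202125,3,2,2
266498,29,4,3
132756,-1,0,0
41937,13,8,8
"""


def mask_missing_neighbors(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where the -1 sentinel in `neighbor` becomes NaN."""
    out = df.copy()
    # Series.mask replaces values where the condition is True (with NaN by default).
    out["neighbor"] = out["neighbor"].mask(out["neighbor"] == -1)
    return out


sample = mask_missing_neighbors(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(sample["neighbor"].isna().sum())  # ASes without relationship data
print(sample["neighbor"].mean())  # NaN is skipped, so -1 no longer skews the mean
```

With the sentinel masked, aggregate functions skip those ASes instead of counting them as having "-1 neighbors".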

SichangHe commented 1 year ago

Figures: as_neighbors_vs_rules, as_neighbors_vs_rules_zoom

Code used.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("as_neighbors_vs_rules.csv")

plt.figure(figsize=(15, 15))
plt.scatter(df['neighbor'], df['import'], color='blue', alpha=0.3, s=2, label='Number of Import')
plt.scatter(df['neighbor'], df['export'], color='red', alpha=0.3, s=2, label='Number of Export')
plt.xlabel('Neighbor')
plt.ylabel('Counts')
plt.title('Neighbor Counts vs Import/Export Counts for Autonomous Systems')
plt.legend()
plt.grid(True)
plt.show()
```

SichangHe commented 1 year ago

With just import count.

![as_neighbors_vs_imports_zoom](https://github.com/SichangHe/internet_route_verification/assets/84777573/af9707cd-35eb-469b-87d1-6802934e74a6)

With just export count.

![as_neighbors_vs_exports_zoom](https://github.com/SichangHe/internet_route_verification/assets/84777573/490f3f82-783c-40cd-a027-bf36f5fac971)

SichangHe commented 1 year ago

It seems that there is at most a weak correlation between neighbor counts and import/export counts (Pearson r ≈ 0.33 for imports and 0.32 for exports, excluding ASes with no neighbor data).

```python
In [19]: clean = df.loc[df['neighbor'] != -1]

In [20]: clean
Out[20]:
       aut_num  neighbor  import  export
0       202125         3       2       2
1       266498        29       4       3
3        41937        13       8       8
4       396741         1       0       0
5       138737        11       0       0
...        ...       ...     ...     ...
95904   201318         2       3       3
95905    17759         1       3       3
95908    32284         2       0       0
95909   399684         3       0       0
95910   136615         1       0       0

[75443 rows x 4 columns]

In [21]: clean['neighbor'].corr(clean['import'])
Out[21]: 0.3318664482728362

In [22]: clean['neighbor'].corr(clean['export'])
Out[22]: 0.3184645567107286
```

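To put those coefficients in perspective: `Series.corr` defaults to Pearson's r, and r² is the share of variance that a straight-line fit would explain. A quick sketch of the arithmetic (the helper name `variance_explained` is made up here):

```python
def variance_explained(r: float) -> float:
    """Share of variance explained by a simple linear fit with Pearson coefficient r."""
    return r * r


r_import = 0.3318664482728362  # neighbor vs import, from the session output
r_export = 0.3184645567107286  # neighbor vs export, from the session output

print(f"import: {variance_explained(r_import):.3f}")  # import: 0.110
print(f"export: {variance_explained(r_export):.3f}")  # export: 0.101
```

So neighbor count alone accounts for roughly a tenth of the variation in rule counts under a linear model.
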
SichangHe commented 1 year ago

Anomalies based on arbitrary constraints.

```python
In [23]: anomalies = df[(df['import'] > 4000) | (df['export'] > 4000) | (df['neighbor'] > 8000)]

In [24]: anomalies
Out[24]:
       aut_num  neighbor  import  export
49709     3257      2781    4780    4780
65705     6695        -1    5226    5226
86320     6939      9628       1       2
88938     3356      6530    5706    5344
91196     1299      2355    5724       4

In [25]: really_naughty_boys = df[(df['neighbor'] > 2000) & (df['import'] > 0) & (df['export'] > 0) & (df['import'] < 10) & (df['export'] < 10)]

In [26]: really_naughty_boys
Out[26]:
       aut_num  neighbor  import  export
9072     39351      2541       9       9
26077     1828      5729       1       2
27487      174      6672       1       1
30852    51185      3029       1       1
48126    36351      2414       2       3
50696    52873      2123       5       5
51320    12779      2072       4       4
64441    18106      2091       4       4
76935    24482      6481       3       4
81599   212483      2841       9       9
81746   132337      2134       1       1
82456    17639      2155       8       7
86320     6939      9628       1       2

In [30]: over_engineered = df[(df['neighbor'] > -1) & (df['neighbor'] < 11) & ((df['import'] > 500) | (df['export'] > 500))]

In [31]: over_engineered
Out[31]:
       aut_num  neighbor  import  export
1906      8928         5    1185    1179
8820     16150         4     662     662
13875     8897         2     569     569
20460    62047         2     637     637
40133     5577         5    1035    1035
42363    44654        10    1198    1198
81610     4589         5    2319    2319
```

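The three filters can also be kept side by side as boolean columns, which makes overlaps visible (AS6939 falls into two classes). A rough sketch, assuming integer counts so that `between(1, 9)` matches `> 0` and `< 10`; the function name `label_classes` and the column names it adds are invented for illustration, and the sample rows are copied from the outputs in this thread:

```python
import pandas as pd


def label_classes(df: pd.DataFrame) -> pd.DataFrame:
    """Add one boolean column per class, using the thresholds from the session."""
    out = df.copy()
    out["anomaly"] = (df["import"] > 4000) | (df["export"] > 4000) | (df["neighbor"] > 8000)
    # Many neighbors but only a handful (1 to 9) of import and of export rules.
    out["few_rules"] = (
        (df["neighbor"] > 2000) & df["import"].between(1, 9) & df["export"].between(1, 9)
    )
    # Known, small neighbor count (0 to 10) but hundreds of rules.
    out["over_engineered"] = df["neighbor"].between(0, 10) & (
        (df["import"] > 500) | (df["export"] > 500)
    )
    return out


sample = pd.DataFrame(
    {
        "aut_num": [3257, 6939, 8928, 202125],
        "neighbor": [2781, 9628, 5, 3],
        "import": [4780, 1, 1185, 2],
        "export": [4780, 2, 1179, 2],
    }
)
labeled = label_classes(sample)
print(labeled[["aut_num", "anomaly", "few_rules", "over_engineered"]])
```
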
cunha commented 1 year ago

Yeah, I like those classes.

What I was thinking for the really naughty boys was something like: `df[(df['neighbor'] > 250) & (df['import'] > 0) & (df['export'] > 0) & (df['import'] + df['export'] < 25)]`, which is pretty close to what you have. I think we could try this out.
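
To make the thresholds easy to vary while trying this out, the filter could be wrapped in a small function. A sketch only: the name `few_rules_many_neighbors` and its parameter names are hypothetical, the defaults are the 250 and 25 proposed here, and the sample rows are copied from earlier outputs in this thread.

```python
import pandas as pd


def few_rules_many_neighbors(
    df: pd.DataFrame, min_neighbors: int = 250, max_total_rules: int = 25
) -> pd.DataFrame:
    """Select ASes with more than `min_neighbors` neighbors, at least one import
    and one export rule, and fewer than `max_total_rules` rules in total."""
    has_both = (df["import"] > 0) & (df["export"] > 0)
    few_rules = df["import"] + df["export"] < max_total_rules
    return df[(df["neighbor"] > min_neighbors) & has_both & few_rules]


sample = pd.DataFrame(
    {
        "aut_num": [202125, 132756, 6939, 3356, 174],
        "neighbor": [3, -1, 9628, 6530, 6672],
        "import": [2, 0, 1, 5706, 1],
        "export": [2, 0, 2, 5344, 1],
    }
)
selected = few_rules_many_neighbors(sample)
print(selected["aut_num"].tolist())  # [6939, 174]
```

ASes with the `-1` sentinel are excluded automatically because `-1 > min_neighbors` is false for any non-negative threshold.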

SichangHe commented 1 year ago

> What I was thinking for the really naughty boys was something like: `df[(df['neighbor'] > 250) & (df['import'] > 0) & (df['export'] > 0) & (df['import'] + df['export'] < 25)]`, which is pretty close to what you have.

There are 122 of them.

```python
In [11]: with pd.option_context('display.max_rows', None):
    ...:     print(df[(df['neighbor'] > 250) & (df['import'] > 0) & (df['export'] > 0) & (df['import'] + df['export'] < 25)])
    ...:
       aut_num  neighbor  import  export
114      13414       259       1       1
634      14061      1778       1       1
1941     41805      1537       1       1
3650      1403       255       5       4
5263     35266       689       8       8
5669     24961      1751       2       2
6136      4764       500       4       3
7497     54113       341       2       2
9072     39351      2541       9       9
9812     57111       309       4       4
12200    18403       380       2       2
12415    50629      1741       3       5
13333     8660       347       4       6
13519   205206       412       2       2
14318    54825       277       1       1
14585    12637      1052       4       4
14972    44444       270       1       1
15367    20473       759       1       1
15561     4181       596       1       1
17861     4766       592       2       2
18224     9299       357      11       1
18738     9498      1227      11      11
19309     7552       297       5       5
19941   396986       302       1       1
20280   138915       257       1       1
20424   201054       366       2       2
21271   263508      1924       1       1
22296   209102      1115       6       6
22581    15576       753       6       6
22921    57777       986       2       2
23219    12400       259      11      10
23332     3214      1530       2       2
23385    15547      1312       3       3
25006   148968       611       6       6
25581    21859       426       4       4
26077     1828      5729       1       2
26529     4637       752       3       3
26822    25291      1460      10      10
27487      174      6672       1       1
28114   196610       430      12      12
28616    49709       367       5       5
29453    23947       293       6       7
29890    23473       303       1       1
30158    17557       304       4      17
30852    51185      3029       1       1
31374    37100      1103       8      11
31889     5645       319       3       3
31930     4775       344       1       1
32313     9505       300       1       1
33318   328320       337       6       6
33401     9318       562       5       5
33429    14630      1518       6       6
33523    38158       283       2       2
35260     4800       317      12       4
35491    58453       589       2       2
36535     5617       298       2       2
36545     9009       438       9       9
38356    64475      1181       2       2
39023    45352      1745       6       7
39672    44103       483       7       7
39776    55256       345       1       1
40595   201333       324       6       6
41745     3170       613       4       4
42479     3786       423       2       2
42706    14840       482       1       1
43058    29838       332       1       1
43225    59605      1317      11       5
43554    58308       401       3       3
44855    34549      1985       2       2
45406     6894       444      10      10
45445    33891      1816       3       4
46834    51088       863      10      10
47075   212828       354       8       8
47274   202365       426       1       1
47708    13030       899       2       2
47925    31027       337       5       5
48126    36351      2414       2       3
50696    52873      2123       5       5
51082   205112       991       2       2
51320    12779      2072       4       4
52298     1031      1673       1       1
53476    55685       275       9       1
54851    53828       703       3       3
55246   396998      1099       1       1
56827      714       317       6       6
58799     6774      1183       7       5
59479     9304      1287      12      12
60330     4755       788       1       1
60586    21574       260      15       3
62331     4844       337       2       2
62340    49697       992      11      13
64441    18106      2091       4       4
65634    35360      1180       3       3
65692    40934       300       1       1
65977     8888      1324       2       2
66782    28329       292       6       6
67305   204092       345      12      12
68210    53062       352       8       8
68450     3130       313       5       3
69458    15169       377       4       1
69957    49102       386      11      11
72389     4788       303       8       8
73647    60068       631      10      11
75484    16347      1789       6       6
75657    16509       349       1       1
76935    24482      6481       3       4
77409     9902       406       2       2
77757    23673       563       2       2
77799   207841       647       1       1
79027     3320       700       2       4
81599   212483      2841       9       9
81746   132337      2134       1       1
82456    17639      2155       8       7
83926    32934       380       1       1
86320     6939      9628       1       2
86815     8315       288       2       4
88733    50877       548       9      11
90405   201053       360       2       2
91253    30081       279       2       2
93249    45899       315       3       3
93656    41666       518       6       6
95349    34854      1265       4       4

In [12]: df[(df['neighbor'] > 250) & (df['import'] > 0) & (df['export'] > 0) & (df['import'] + df['export'] < 25)].describe()
Out[12]:
             aut_num     neighbor      import      export
count     122.000000   122.000000  122.000000  122.000000
mean    56210.696721  1021.959016    4.393443    4.237705
std     79934.322426  1327.329790    3.474957    3.435482
min       174.000000   255.000000    1.000000    1.000000
25%      9307.500000   337.000000    2.000000    2.000000
50%     26810.000000   533.000000    3.000000    3.000000
75%     53636.500000  1255.500000    6.000000    6.000000
max    396998.000000  9628.000000   15.000000   17.000000
```