datasciencecampus / transport-network-performance

Measuring the performance of transport networks around urban centres
https://datasciencecampus.github.io/transport-network-performance/
MIT License

html_report KeyError #262

Open r-leyshon opened 2 months ago

r-leyshon commented 2 months ago

Originally identified by @SergioRec in #248, context below. To reproduce the error, I used GtfsInstance on the chester test fixture. Using html_report(extended_validation=True, clean_feed=False) should trigger the KeyError: 'multiple_stops_invalid'.

A short-term sidestep (kicking the can down the road) could be to change the default value of extended_validation to False.
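To make the sidestep concrete, here is a rough sketch using stand-in functions. `html_report_stub` and `run_extended_validation` are hypothetical names for illustration only, not the package's code; only the `extended_validation` parameter name comes from the traceback, and the return strings are made up.

```python
def run_extended_validation() -> None:
    """Hypothetical stand-in for the step that currently raises."""
    raise KeyError("multiple_stops_invalid")


def html_report_stub(extended_validation: bool = False) -> str:
    """Hypothetical stand-in for GtfsInstance.html_report.

    With the default flipped to False, callers only reach the failing
    step when they opt in explicitly.
    """
    if extended_validation:
        run_extended_validation()
        return "summary + extended reports"
    return "summary only"
```

Calling `html_report_stub()` with no arguments skips the failing branch, while `html_report_stub(extended_validation=True)` still raises, so the underlying bug stays in place until the real fix lands.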

Original context below:

Here's the code I ran:
# %%
from transport_performance.gtfs.multi_validation import MultiGtfsInstance
from pyprojroot import here

# %%
t = MultiGtfsInstance(here('data/interim/gtfs/itm_leeds_filtered_gtfs.zip'))
s = t.instances[0]
# %%
s.html_report(overwrite=True, clean_feed=False)
# %%

Here's the full traceback:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[3], line 2
      1 # %%
----> 2 s.html_report(overwrite=True, clean_feed=False)

File ~/src/transport_performance/gtfs/validation.py:1502, in GtfsInstance.html_report(self, report_dir, overwrite, summary_type, extended_validation, clean_feed)
   1500 # create extended reports if requested
   1501 if extended_validation:
-> 1502     self._extended_validation(output_path=report_dir)
   1503     info_href = (
   1504         validation_dataframe["message"].apply(
   1505             lambda x: "_".join(x.split(" "))
   (...)
   1509         + ".html"
   1510     )
   1511     validation_dataframe["info"] = [
   1512         f"""<a href="{href}"> Further Info</a>"""
   1513         if len(rows) > 1
   1514         else "Unavailable"
   1515         for href, rows in zip(info_href, validation_dataframe["rows"])
   1516     ]

File ~/src/transport_performance/gtfs/validation.py:1376, in GtfsInstance._extended_validation(self, output_path, scheme)
   1371         duplicate_counts[col] = impacted_rows[
   1372             impacted_rows[f"{col}_original"]
   1373             == impacted_rows[f"{col}_duplicate"]
   1374         ].shape[0]
   1375 else:
-> 1376     impacted_rows = table_map[table].copy().iloc[rows]
   1378 # create the html to display the impacted rows (clean possibly)
   1379 table_html = f"""
   1380 <head>
   1381     <link rel="stylesheet" href="styles.css">
   (...)
   1390             {msg_type}</span>
   1391 </h1>"""

KeyError: 'multiple_stops_invalid'
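
The last frame fails on the lookup `table_map[table]`, so 'multiple_stops_invalid' appears to be used as a key that `table_map` does not contain. Below is a minimal sketch of that failure mode and one possible guard, assuming `table_map` behaves like a plain dict. The table contents and the `lookup_impacted_rows` helper are hypothetical, and this is not necessarily how the open PR addresses it.

```python
from typing import List, Optional

# Hypothetical stand-in: maps table names to their rows.
table_map = {
    "stops": ["stop_a", "stop_b", "stop_c"],
    "trips": ["trip_1", "trip_2"],
}


def lookup_impacted_rows(table: str, rows: List[int]) -> Optional[List[str]]:
    """Return the impacted rows, or None when the table name is unknown."""
    source = table_map.get(table)  # .get avoids the KeyError on a missing key
    if source is None:
        return None
    return [source[i] for i in rows]
```

A direct `table_map["multiple_stops_invalid"]` raises the same KeyError as in the traceback, whereas the guarded helper returns None, which the caller could report as "no further info available" for that message.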

Originally posted by @SergioRec in https://github.com/datasciencecampus/transport-network-performance/issues/248#issuecomment-1959528482

CBROWN-ONS commented 2 months ago

I'm pretty sure this is fixed in one of the open PRs, it just hasn't been implemented since the PR has not been merged.

r-leyshon commented 2 months ago

> I'm pretty sure this is fixed in one of the open PRs, it just hasn't been implemented since the PR has not been merged.

Thanks Charlie, once that backlog has been integrated with the package, we can close this out.