HBCUMobility / datacollection


Naming procedure for TimeMap files may cause clashes #21

Closed machawk1 closed 2 years ago

machawk1 commented 2 years ago

Per Slack, the current procedure for generating a fetched TimeMap's filename from its URL is naive and might cause filename clashes. As implemented... https://github.com/HBCUMobility/datacollection/blob/a9720deda068958121166926336a43c3e53130cf/fetch_timemaps.py#L32-L35

https://school.edu/index.html and https://school.edu/dept/index.html would produce the same filename. This can be resolved by using a hash of the URI-R as the basis for the filename.
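
A minimal sketch of the hashing idea, not the repository's actual code: the function name `timemap_filename`, the choice of SHA-256, and the `.timemap` suffix are all assumptions made for illustration. It hashes the full URI-R so that two URLs sharing the same final path segment still map to different filenames.

```python
import hashlib


def timemap_filename(uri_r: str, suffix: str = ".timemap") -> str:
    """Build a filename from the SHA-256 hex digest of the full URI-R.

    Both the function name and the default suffix are hypothetical;
    swap in whatever naming the collection scripts expect.
    """
    digest = hashlib.sha256(uri_r.encode("utf-8")).hexdigest()
    return digest + suffix


# The two URLs from this issue now map to distinct filenames.
root_name = timemap_filename("https://school.edu/index.html")
dept_name = timemap_filename("https://school.edu/dept/index.html")
print(root_name)
print(dept_name)
```

Because a hash cannot be reversed, the URI-R would probably need to be recorded alongside each file (for example in a separate index) if the original URL has to be recovered from the filename later.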