Closed: spencer-b-318 closed this issue 1 year ago
We discussed the suggested fix using low_memory=False and the alternative of setting each data type explicitly on import. We decided it was best to move forward with the low_memory=False fix, as the difference in performance was not noticeable when testing and it suppresses the dtype warning.
There is a third alternative solution. low_memory=False is not necessarily best practice. Was testing done with the full dataset? Setting dtype does take longer, but it allows you to easily spot incorrect data. Just be aware of why the dtype warning occurs: when low_memory=True, the file is split into chunks for processing, and if one chunk has all integers for a column while another has strings with a mix of numbers and letters, you will get the dtype warning.
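To make the point about spotting incorrect data concrete, here is a minimal sketch of one way to find the values that stop a column from being numeric. The sample data, the column names, and the helper `non_numeric_values` are all hypothetical and are not taken from the project.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real stops file.
csv_text = "stop_id,code\n1,100\n2,200\n3,A300\n4,400\n"

# Read every column as text so pandas performs no type inference at all.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)


def non_numeric_values(series):
    """Return the entries of a text column that cannot be parsed as numbers."""
    parsed = pd.to_numeric(series, errors="coerce")
    # Entries that were present but became NaN are the non-numeric ones.
    return series[parsed.isna() & series.notna()]


bad = non_numeric_values(df["code"])
print(bad.tolist())
```

Running a check like this on each column named in the warning should show whether the mixed types come from genuine alphanumeric codes or from bad rows.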
The recommendation from @lldwork is not to suppress the warning, which only kicks the problem down the line, but instead to format the affected columns correctly (only 4 of them) by setting the column dtype on import. This should be low effort.
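A rough sketch of what the recommended approach could look like is shown below. The four column names and the sample rows are placeholders, since the issue does not say which columns are affected; the real names would come from the warning message.

```python
import io

import pandas as pd

# Placeholder names for the four mixed-type columns (assumption, not from the issue).
MIXED_COLUMN_DTYPES = {
    "stop_code": str,
    "plate_code": str,
    "cleardown_code": str,
    "locality_code": str,
}

# Hypothetical sample standing in for the real stops file.
csv_text = (
    "stop_code,plate_code,cleardown_code,locality_code,latitude\n"
    "1001,P1,77,E0035600,51.45\n"
    "A1002,22,C9,12345,51.46\n"
)

# Columns listed in the mapping are read as text; the rest are still inferred.
stops = pd.read_csv(io.StringIO(csv_text), dtype=MIXED_COLUMN_DTYPES)
print(stops.dtypes)
```

Declaring the types up front means pandas no longer has to guess per chunk, so the warning should not be raised for those columns and the default low_memory behaviour can stay in place.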
Console output warns that columns have mixed types. The warning appears after "Calling Naptan API to get lat/lon for each stop... " and before "Generating timetable".
Expect to see no warnings/errors on timetable generation