ITSLeeds / UK2GTFS

Convert UK transport data (TransXchange / ATOC CIF) to GTFS format in R
https://itsleeds.github.io/UK2GTFS/
GNU General Public License v3.0

TxC - HolidaysOnly? #35

Closed stupidpupil closed 2 years ago

stupidpupil commented 3 years ago

The TrawsCymru T2 (TNDS, OGLv3) currently has the following VehicleJourney:

    <VehicleJourney SequenceNumber="3">
      <PrivateCode>0T2BHMGZ4:I:3</PrivateCode>
      <OperatingProfile>
        <RegularDayType>
          <HolidaysOnly />
        </RegularDayType>
        <BankHolidayOperation>
          <DaysOfOperation>
            <HolidayMondays />
          </DaysOfOperation>
        </BankHolidayOperation>
      </OperatingProfile>
      <VehicleJourneyCode>VJ44</VehicleJourneyCode>
      <ServiceRef>TCAT002</ServiceRef>
      <LineRef>SL1</LineRef>
      <JourneyPatternRef>JP22</JourneyPatternRef>
      <DepartureTime>13:45:00</DepartureTime>
    </VehicleJourney>

This ended up with a DaysOfWeek of "NA" at https://github.com/ITSLeeds/UK2GTFS/blob/1a8cde20af5c3e11c65196e05ea3d73b4ebb86e0/R/transxchange_export.R#L31 , which was then turned into a GTFS calendar indicating all-week running by https://github.com/ITSLeeds/UK2GTFS/blob/44b61eb0fad8ac3b770b6aa222f5057a504118ee/R/transxchange_export_functions.R#L215-L216 .
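To make the failure mode concrete, here is a minimal Python sketch (not the package's R code) of the distinction that seems to be getting lost: a RegularDayType containing HolidaysOnly should give no regular weekdays, whereas a profile with no day information at all is the case that can reasonably default to all-week running. The function names `regular_days` and `calendar_flags` are hypothetical, the sample is a namespace-free cut-down of the journey above, and grouped day elements such as MondayToFriday are deliberately not handled.

```python
import xml.etree.ElementTree as ET

WEEK = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Cut-down, namespace-free copy of the VehicleJourney quoted in this issue.
SAMPLE = """
<VehicleJourney SequenceNumber="3">
  <OperatingProfile>
    <RegularDayType>
      <HolidaysOnly />
    </RegularDayType>
    <BankHolidayOperation>
      <DaysOfOperation>
        <HolidayMondays />
      </DaysOfOperation>
    </BankHolidayOperation>
  </OperatingProfile>
</VehicleJourney>
"""


def regular_days(vehicle_journey):
    """Return the ordinary weekdays a journey runs on (hypothetical helper)."""
    regular = vehicle_journey.find("OperatingProfile/RegularDayType")
    if regular is None:
        # No day information at all: assume all-week running.
        return list(WEEK)
    if regular.find("HolidaysOnly") is not None:
        # Explicitly holidays only: no regular weekdays.
        return []
    days = regular.find("DaysOfWeek")
    if days is None:
        return list(WEEK)
    # Simplification: only individual day elements are recognised here.
    return [day for day in WEEK if days.find(day) is not None]


def calendar_flags(days):
    """Build the seven 0/1 weekday fields of a GTFS calendar.txt row."""
    return {day.lower(): int(day in days) for day in WEEK}


journey = ET.fromstring(SAMPLE)
flags = calendar_flags(regular_days(journey))
```

With this approach the journey above gets a calendar row of all zeros, so it would only run on dates added separately (for example through calendar_dates.txt entries for the bank holidays).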

https://github.com/stupidpupil/UK2GTFS/commit/2f71970661e7ab73d4475126348a9604d4769089 is a really rough quick-fix to tackle this case, but I don't believe this is the right approach or that it'll work in general.

(I'm guessing that the right place to tackle this might be around https://github.com/ITSLeeds/UK2GTFS/blob/44b61eb0fad8ac3b770b6aa222f5057a504118ee/R/import_OperatingProfile.R#L30-L44 .)

mem48 commented 3 years ago

I think your commit is the best option. The issue is that DaysOfWeek can be NA because of HolidaysOnly, but it is also NA on 7-day-a-week services where the days simply weren't specified.

But I've found a deeper problem: data being split across similarly named columns. When parsing this file, the VehicleJourneys data frame has the columns

BankHolidaysOperate
BankHolidaysNoOperate 
HolidaysOnly     
BHDaysOfOperation
BHDaysOfNonOperation

But the code that parses bank holidays only considers some of these columns:

https://github.com/ITSLeeds/UK2GTFS/blob/44b61eb0fad8ac3b770b6aa222f5057a504118ee/R/transxchange_export.R#L318
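As an illustration of the problem only (this is not the committed fix, and it is Python rather than the package's R), the sketch below folds the similarly named columns into a single operate / no-operate pair so that no column is silently ignored. It assumes each column holds either nothing or a space-separated string of holiday names, and that HolidaysOnly is truthy when set; both are assumptions about the data frame, and `bank_holiday_rules` is a hypothetical name.

```python
def bank_holiday_rules(row):
    """Merge the similarly named bank holiday columns of one journey.

    `row` is a dict keyed by the column names listed above. Values are
    assumed to be None, "" or a space-separated string of holiday names.
    """
    def names(*columns):
        found = set()
        for column in columns:
            value = row.get(column)
            if value:  # skips missing, None and empty values
                found.update(value.split())
        return found

    return {
        "holidays_only": bool(row.get("HolidaysOnly")),
        "operate": sorted(names("BankHolidaysOperate", "BHDaysOfOperation")),
        "no_operate": sorted(names("BankHolidaysNoOperate", "BHDaysOfNonOperation")),
    }


# The T2 journey from this issue, as it might appear in the data frame.
rules = bank_holiday_rules({
    "BankHolidaysOperate": None,
    "BankHolidaysNoOperate": None,
    "HolidaysOnly": True,
    "BHDaysOfOperation": "HolidayMondays",
    "BHDaysOfNonOperation": None,
})
```

Here `rules["operate"]` picks up HolidayMondays even though it arrived in BHDaysOfOperation rather than BankHolidaysOperate.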

I've just committed a fix that I think works, but it needs testing on a wider range of files.

stupidpupil commented 3 years ago

I'll have a go at testing it with my Wales work ASAP but it might be a little while.