MobilityData / gtfs-validator

Canonical GTFS Validator project for schedule (static) files.
https://gtfs-validator.mobilitydata.org/
Apache License 2.0

Output fieldName of missing_recommended_field notice even if there is no column #1425

Closed takohei closed 1 year ago

takohei commented 1 year ago

Describe the bug

RULES.md specifies that the missing_recommended_field notice should include a fieldName.

However, if the column is absent from the header, the fieldName is not output, which makes it difficult to identify which field is unset.

Steps/Code to Reproduce

Validate the attached feed. Its feed_info.txt has no feed_contact_email column and no value for feed_contact_url.

Expected Results

In report.json, both feed_contact_email and feed_contact_url notices have fieldNames.

    {
      "code": "missing_recommended_field",
      "severity": "WARNING",
      "totalNotices": 2,
      "sampleNotices": [
        {
          "filename": "feed_info.txt",
          "csvRowNumber": 2.0,
          "fieldName": "feed_contact_email"
        },
        {
          "filename": "feed_info.txt",
          "csvRowNumber": 2.0,
          "fieldName": "feed_contact_url"
        }
      ]
    }

Actual Results

In report.json, the feed_contact_url notice has fieldName, but the feed_contact_email notice has no fieldName.

    {
      "code": "missing_recommended_field",
      "severity": "WARNING",
      "totalNotices": 2,
      "sampleNotices": [
        {
          "filename": "feed_info.txt",
          "csvRowNumber": 2.0
        },
        {
          "filename": "feed_info.txt",
          "csvRowNumber": 2.0,
          "fieldName": "feed_contact_url"
        }
      ]
    }

Screenshots

The detail table for report.json shows a value but no fieldName header.

Files used

no_mail_column_no_url_value_akocity.zip report.json.txt

Validator version

4.1.0

Operating system

Windows

Java version

18

Additional notes

This is just a guess, as I do not have a development environment, but the following code may fail to get the column name when the column is missing from the header.

https://github.com/MobilityData/gtfs-validator/blob/19ed06d894a0ee44f104bf54cae3301d3c7ef7ed/core/src/main/java/org/mobilitydata/gtfsvalidator/parsing/RowParser.java#L134-L138

takohei commented 1 year ago

I checked the variables using the debugger and my guess was correct.

If a recommended column is not in the header, columnIndex will be -1 and getColumnName() will not return the column name, so the fieldName will not be output. I am not sure how to fix it properly.
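The failure mode described above can be reproduced in isolation. The sketch below is not the validator's code: `HeaderLookup`, `indexOf`, and `nameAt` are hypothetical names standing in for the header lookup and for `getColumnName()`, and it assumes that a column missing from the header is represented by index -1 and that the name lookup yields null in that case.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parsed CSV header; not the validator's actual class.
final class HeaderLookup {
  private final List<String> header;

  HeaderLookup(String... columns) {
    this.header = Arrays.asList(columns);
  }

  // Returns the position of the column in the header, or -1 if it is absent.
  int indexOf(String columnName) {
    return header.indexOf(columnName);
  }

  // Resolves a name from an index; an absent column (-1) has no name to return.
  String nameAt(int columnIndex) {
    if (columnIndex < 0 || columnIndex >= header.size()) {
      return null;
    }
    return header.get(columnIndex);
  }

  public static void main(String[] args) {
    // feed_contact_email is deliberately left out of the header.
    HeaderLookup feedInfo = new HeaderLookup("feed_publisher_name", "feed_contact_url");
    int urlIndex = feedInfo.indexOf("feed_contact_url");
    int emailIndex = feedInfo.indexOf("feed_contact_email");
    System.out.println(urlIndex + " -> " + feedInfo.nameAt(urlIndex));     // 1 -> feed_contact_url
    System.out.println(emailIndex + " -> " + feedInfo.nameAt(emailIndex)); // -1 -> null
  }
}
```

Once the lookup goes through the index, the name of an absent column is lost. This matches the report, where only the notice for the column that exists in the header carries a fieldName.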

isabelle-dr commented 1 year ago

Thank you for opening this issue @takohei! Our team will have a look at it shortly.

takohei commented 1 year ago

@isabelle-dr Thanks for your reply and triage.

As a test, I changed the argument of the asX() methods in RowParser.java from FieldLevelEnum to GtfsColumnDescriptor, and it worked well:

    public String asString(int columnIndex, GtfsColumnDescriptor columnDescriptor) {
      String s = row.asString(columnIndex);
      // Take the column name from the descriptor, so it is available even when
      // the column is missing from the header (columnIndex == -1).
      if (columnDescriptor.fieldLevel() == FieldLevelEnum.REQUIRED && s == null) {
        noticeContainer.addValidationNotice(
            new MissingRequiredFieldNotice(
                fileName, getRowNumber(), columnDescriptor.columnName()));
      } else if (columnDescriptor.fieldLevel() == FieldLevelEnum.RECOMMENDED && s == null) {
        noticeContainer.addValidationNotice(
            new MissingRecommendedFieldNotice(
                fileName, getRowNumber(), columnDescriptor.columnName()));
      }
      // ... (rest of the method omitted)
    }

qcdyx commented 1 year ago

Hello @takohei, you are right. Using the GtfsColumnDescriptor is the way to go. I fixed this issue using this approach. Thanks for digging deep into the code.
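For readers without the validator source at hand, the descriptor-based approach can be sketched in a self-contained form. Everything here is hypothetical and simplified: `DescriptorSketch`, `ColumnDescriptor`, `Level`, and `noticeFor` are illustrative names, not the validator's API, and a notice is modelled as a plain string. A null cell value stands for both an absent column and an empty value.

```java
// Minimal, hypothetical model of the descriptor-based check; not the validator's actual code.
final class DescriptorSketch {
  enum Level { REQUIRED, RECOMMENDED, OPTIONAL }

  // Carries the column name together with its level, independent of the CSV header.
  static final class ColumnDescriptor {
    final String columnName;
    final Level level;

    ColumnDescriptor(String columnName, Level level) {
      this.columnName = columnName;
      this.level = level;
    }
  }

  // Returns a notice string for a missing value, or null when no notice applies.
  static String noticeFor(String value, ColumnDescriptor descriptor) {
    if (value != null) {
      return null;
    }
    switch (descriptor.level) {
      case REQUIRED:
        return "missing_required_field: " + descriptor.columnName;
      case RECOMMENDED:
        return "missing_recommended_field: " + descriptor.columnName;
      default:
        return null;
    }
  }

  public static void main(String[] args) {
    ColumnDescriptor email = new ColumnDescriptor("feed_contact_email", Level.RECOMMENDED);
    // The column is absent from the header, so the cell value is null,
    // yet the notice still names the field because the name comes from the descriptor.
    System.out.println(noticeFor(null, email)); // missing_recommended_field: feed_contact_email
  }
}
```

Because the name no longer depends on the column's position in the header, an absent column and an empty value produce the same, fully named notice.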