Data-Liberation-Front / csvlint.rb

The gem behind http://csvlint.io
MIT License

Validation not working with StringIO input #177

Open opoudjis opened 8 years ago

opoudjis commented 8 years ago
validator = Csvlint::Validator.new( StringIO.new( csv ) , {}, csv_schema)

ends up calling

        @schema.validate_header(header, @source, @validate)

But because @source, the initial argument of Csvlint::Validator.new, is a StringIO, @schema.validate_header ends up attempting

table = tables[table_url]

where table_url is @source. Of course, that doesn't work: tables is keyed by URL, even when the input to Validator is a StringIO, so the lookup never finds a table.
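
To illustrate the failed lookup without csvlint itself, here is a minimal sketch. The `tables` hash and its URL key are stand-ins I made up for the example, not the library's actual data; the only point is that a hash keyed by URL strings returns nil when indexed with a StringIO object.

```ruby
require 'stringio'

# Stand-in for the table group's lookup: tables are keyed by URL string.
# (Hypothetical data; the real hash is built from the CSVW metadata.)
tables = { "http://example.com/naplan_student_csv_csvw.csv" => :table_definition }

# The validator was handed a StringIO, so that object ends up as the lookup key.
source = StringIO.new("LocalId,GivenName\nfjghh371,Treva\n")

# No URL key is equal to a StringIO object, so the lookup comes back empty.
table = tables[source]
puts table.inspect
```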

I suggest adding a test that covers validation against a schema with StringIO input.

opoudjis commented 8 years ago

Patched it with:

    table_url = tables.keys[0] if table_url.instance_of? StringIO

for Csvlint::Csvw::TableGroup::validate_header and Csvlint::Csvw::TableGroup::validate_row
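
In isolation, the fallback behaves roughly like the sketch below. `resolve_table_url` is a hypothetical helper written only for this illustration (the actual patch is the single line above, inside the two methods), and the `tables` hash is again a stand-in.

```ruby
require 'stringio'

# Hypothetical helper mirroring the one-line patch: when the source is a
# StringIO rather than a URL, fall back to the first table in the group.
def resolve_table_url(tables, table_url)
  table_url = tables.keys[0] if table_url.instance_of?(StringIO)
  table_url
end

# Stand-in data; the real hash is built from the CSVW metadata.
tables = { "http://example.com/naplan_student_csv_csvw.csv" => :table_definition }

# A StringIO source is mapped to the first table's URL.
url_from_io = resolve_table_url(tables, StringIO.new("a,b\n1,2\n"))

# A URL string passes through unchanged.
url_from_str = resolve_table_url(tables, "http://example.com/naplan_student_csv_csvw.csv")

puts url_from_io
puts url_from_str
```

Note that this only makes sense when the metadata describes a single table; with several tables, picking the first key is a guess about which one the StringIO corresponds to.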

adamc00 commented 8 years ago

Hi @opoudjis. I am not a team member of the project but have had a few pull requests accepted by the team. Your fix will probably get in more quickly if you fork the repo, apply your fix, add appropriate testing, and submit a pull request for your improvements against head.

opoudjis commented 8 years ago

Cool, I was going to do that anyway. Will let you know when I've done it.

Just to make sure I’m not completely abusing csvlint, this is my test script. Can I confirm with you there’s nothing strange about how I’m invoking it?

csv_schema = <<JSON
{
  "@context": "http://www.w3.org/ns/csvw",
  "null": true,
  "tables": [{
    "url": "naplan_student_csv_csvw.csv",
    "tableSchema": {
     "columns": [
      {"name": "LocalId", "datatype": {"base": "string"}},
      {"name": "FamilyName", "datatype": {"base": "string"}},
      {"name": "GivenName", "datatype": {"base": "string"}},
      {"name": "Homegroup", "datatype": {"base": "string"}},
      {"name": "ClassCode", "datatype": {"base": "string"}},
      {"name": "ASLSchoolId", "datatype": {"base": "string"}},
      {"name": "SchoolLocalId", "datatype": {"base": "string"}},
      {"name": "LocalCampusId", "datatype": {"base": "string"}},
      {"name": "EmailAddress", "datatype": {"base": "string"}},
      {"name": "ReceiveAdditionalInformation", "datatype": {"base": "boolean", "format": "Y|N"}},
      {"name": "StaffSchoolRole", "datatype": {"base": "string"}}
    ]}}]
}
JSON
require 'csvlint'
require 'json'
csv_schema = Csvlint::Schema.from_csvw_metadata("http://example.com", JSON.parse(csv_schema))
csv = <<CSV
LocalId,GivenName,FamilyName,Homegroup,ClassCode,ASLSchoolId,SchoolLocalId,LocalCampusId,EmailAddress,ReceiveAdditionalInformation,StaffSchoolRole
fjghh371,Treva,Seefeldt,7E,"7D,7E",knptb460,046129,01,tseefeldt@example.com,Y,teacher
CSV

validator = Csvlint::Validator.new( StringIO.new( csv ) , {}, csv_schema)
puts "a" if validator.valid?
puts validator.errors

opoudjis commented 8 years ago

See https://github.com/nsip/csvlint.rb for patch

JeniT commented 8 years ago

@opoudjis as @adamc00 says, if you submit pull requests and the tests pass then we're a lot more likely to be able to apply changes quickly. Thanks.

adamc00 commented 8 years ago

@opoudjis, here's the pull request (PR) documentation in case you are unfamiliar. https://help.github.com/articles/creating-a-pull-request/