xp-framework / rfc

One of the major deficiencies in the development of many projects is that there is no roadmap or strategy available other than in the developers' heads. The XP team publishes its decisions by documenting change requests in the form of RFCs.

CsvMapReader / CsvMapWriter #235

Closed by thekid 12 years ago

thekid commented 12 years ago

Scope of Change

Instead of working with integers describing the cell offsets in CSV files, we will add a class CsvMapReader to read data into a map, and a class CsvMapWriter to write data from a map to CSV files.

Rationale

Enable writing code that deals flexibly with a changing field order.

Functionality

These classes will complement the "list", "bean" and "object" versions already inside the package text.csv.

Reading

Assuming the following CSV file:

id;realname;email
1549;Timm Friebe;timm@example.com

Currently, code that must keep working when the order of the fields changes looks like this:

<?php
  $reader= new CsvListReader(new TextReader($input));
  $lookup= array_flip($reader->getHeaders());
  while ($record= $reader->read()) {
    $email= $record[$lookup['email']];
  }
?>

With the new CsvMapReader class, this functionality is built in:

<?php
  $reader= new CsvMapReader(new TextReader($input));
  $reader->setKeys($reader->getHeaders());
  while ($record= $reader->read()) {
    $email= $record['email'];
  }
?>
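
The key-based lookup can also be illustrated without the XP classes. The following is a minimal sketch in plain PHP, not the CsvMapReader implementation: the helper name readMaps() is hypothetical, and the sketch assumes that the first line holds the headers, that `;` is the delimiter as in the sample file above, and that no field is quoted.

```php
<?php
// Hypothetical helper, for illustration only: turns CSV text into a list of
// maps keyed by the header names found in the first line.
// Assumption: fields are unquoted and never contain the delimiter.
function readMaps(string $csv, string $delimiter= ';'): array {
  $lines= explode("\n", trim($csv));
  $headers= explode($delimiter, array_shift($lines));
  $records= [];
  foreach ($lines as $line) {
    // array_combine() pairs each header with the value at the same offset
    $records[]= array_combine($headers, explode($delimiter, $line));
  }
  return $records;
}

$a= readMaps("id;realname;email\n1549;Timm Friebe;timm@example.com");
$b= readMaps("email;id;realname\ntimm@example.com;1549;Timm Friebe");

// Both field orders give the same lookup result
echo $a[0]['email'], "\n";  // prints "timm@example.com"
echo $b[0]['email'], "\n";  // prints "timm@example.com"
```

Because the values are addressed by name, reordering the columns in the input does not affect the calling code.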

Writing

On a per-line basis, writing maps is almost the same as writing lists, since array_values() transforms a map into a list. When no headers have been set, both versions are equivalent.

<?php
  $writer= new CsvListWriter(new TextWriter($output));
  $writer->setHeaders(array('id', 'email'));
  $writer->write(array_values(array(
    'id'    => 1,
    'email' => 'thekid@example.com'
  )));
?>

The difference is that with the map writer, the order of the headers defines the order in which a record's values are written. Even if the map passed is array('email' => '...', 'id' => 1), the fields end up in the correct order in the file. With the list writer, the email address would be written to the first field in the output and the id to the second.

<?php
  $writer= new CsvMapWriter(new TextWriter($output));
  $writer->setHeaders(array('id', 'email'));
  $writer->write(array(
    'id'    => 1,
    'email' => 'thekid@example.com'
  ));

  // Also yields correct result
  $writer->write(array(
    'email' => 'thekid@example.com',
    'id'    => 1
  ));
?>
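
The ordering behavior can be sketched in plain PHP. This is an illustration under stated assumptions, not the CsvMapWriter implementation: orderByHeaders() is a hypothetical helper, and treating a missing key as an empty field is an assumption the RFC does not specify.

```php
<?php
// Hypothetical helper, for illustration only: returns the record's values in
// header order, regardless of the order of the keys in the map.
function orderByHeaders(array $headers, array $record): array {
  $line= [];
  foreach ($headers as $header) {
    // Assumption: keys missing from the record become empty fields
    $line[]= $record[$header] ?? '';
  }
  return $line;
}

$headers= ['id', 'email'];
$swapped= ['email' => 'thekid@example.com', 'id' => 1];

// List-style: array_values() keeps the map's own order, so columns are swapped
echo implode(';', array_values($swapped)), "\n";             // thekid@example.com;1

// Map-style: values follow the header order
echo implode(';', orderByHeaders($headers, $swapped)), "\n"; // 1;thekid@example.com
```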

Security considerations

None.

Speed impact

None, just two new classes.

Dependencies

None.

Related documents

http://supercsv.sourceforge.net/javadoc/org/supercsv/io/CsvMapReader.html
http://grails.org/plugin/csv
xp-framework/rfc#191: The original "new CSV API" RFC

thekid commented 12 years ago

Should there be overloaded versions of the setProcessor() / withProcessor() / setProcessors() / withProcessors() API that accept the names of the map's keys as strings instead of integer offsets, or would it be sufficient to have an easy accessor that retrieves the integer offset of a given key?

<?php
  $listWriter->setProcessor(0, new FormatDate('d.m.Y'));

  // #1: Overloaded versions
  $mapWriter->setProcessor('date', new FormatDate('d.m.Y'));

  // #2: int key(string $name) accessor
  $mapWriter->setProcessor($mapWriter->key('date'), new FormatDate('d.m.Y'));
?>

Especially in combination with xp-framework/rfc#234, the first option would make sense IMO.
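
How the two options relate can be sketched as follows. Everything here is hypothetical and only illustrates the idea: the class KeyedProcessors is a stand-in, not part of text.csv, and plain callables take the place of cell processors such as FormatDate.

```php
<?php
// Hypothetical stand-in for a map writer, for illustration only
class KeyedProcessors {
  private $offsets;
  private $processors= [];

  public function __construct(array $keys) {
    // array_flip() turns ['id', 'date'] into ['id' => 0, 'date' => 1]
    $this->offsets= array_flip($keys);
  }

  // Option #2: accessor returning the integer offset of a key
  public function key(string $name): int {
    if (!isset($this->offsets[$name])) {
      throw new InvalidArgumentException('Unknown key "'.$name.'"');
    }
    return $this->offsets[$name];
  }

  // Option #1: accepts either an integer offset or a key name
  public function setProcessor($cell, callable $processor): void {
    $offset= is_string($cell) ? $this->key($cell) : $cell;
    $this->processors[$offset]= $processor;
  }

  // Returns the registered processors, indexed by integer offset
  public function processors(): array {
    return $this->processors;
  }
}

$p= new KeyedProcessors(['id', 'date']);
$upper= 'strtoupper';  // any callable stands in for a cell processor here

$p->setProcessor('date', $upper);         // option #1: by key name
$p->setProcessor($p->key('id'), $upper);  // option #2: via the accessor
```

In this sketch the overloaded version is built on top of the accessor, so offering option #1 does not rule out exposing option #2 as well.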

thekid commented 12 years ago

Announced with http://news.planet-xp.net/article/463/2012/06/08/5.8.5_Release