invinst / invisible-flow

Transforming foia response data #12

Open adesca opened 5 years ago

adesca commented 5 years ago

AS Sam GIVEN that Alex has uploaded a 060 foia response I WANT the data formatted to match the schema provided in the definitions card AND THEN saved as a csv under [timestamp]\transformed WITH a UNIQUE KEY dependent on the UNIQUE KEY used by the schema for that document

This should be done for each of the datasets defined in the definitions card

adesca commented 5 years ago

Based on the current state of master, it's possible to upload a file and have it saved appropriately. I would thus suggest:

  1. Creating a task for each upload report type as defined in issue #6
  2. For each report type, determining the unique key used by the database, as specified by the schema here
  3. Implement LocalStorage.get to return the string content of the stored files
  4. Use Python's built-in csv library to parse the string
  5. Use the implementation from #11 to convert the parsed version into an entity
  6. Save that entity as a file
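
For the parsing step, here is a minimal sketch of turning the stored file's string content into rows with the standard csv module. It assumes LocalStorage.get hands back the whole file as one string. The helper name parse_csv_string and the sample columns CRID and BEAT are hypothetical, not taken from the repo or the real responses.

```python
import csv
import io
from typing import Dict, List


def parse_csv_string(content: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per data row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(content))
    return [dict(row) for row in reader]


# Hypothetical sample standing in for whatever LocalStorage.get returns.
sample = "CRID,BEAT\n1001,0412\n1002,1533\n"
rows = parse_csv_string(sample)
```

Every value comes back as a string, so any type conversion would still have to happen when the rows are converted into entities.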

adesca commented 5 years ago

Steps that need to be done for all foia response types

  1. Map the response columns in the raw csv to the appropriate database columns.
  2. Using this mapping, implement transformation logic that maps from the foia response to their respective database entities
    • Be sure to refer to the existing code for case info to see an example of how we should do this
  3. Save the resulting csv string to GCS as {}.csv, where {} is the name of the table(s) that the csv was transformed to. For example, case_info.csv would change to data_allegations.csv
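
A rough sketch of how these three steps could fit together, using only the standard library. The mapping contents (CRID to cr_id, BEAT to beat_id) and the helper names transform_rows, to_csv_string and output_name are all hypothetical; the real mapping has to come from the schema, and the existing case info code remains the reference for how entities are built. The upload to GCS is left out on purpose.

```python
import csv
import io
from typing import Dict, List

# Hypothetical mapping from raw response columns to database columns.
COLUMN_MAP = {"CRID": "cr_id", "BEAT": "beat_id"}


def transform_rows(
    raw_rows: List[Dict[str, str]], column_map: Dict[str, str]
) -> List[Dict[str, str]]:
    """Rename mapped columns and drop any column that is not in the mapping."""
    return [
        {db_col: row.get(raw_col, "") for raw_col, db_col in column_map.items()}
        for row in raw_rows
    ]


def to_csv_string(rows: List[Dict[str, str]], fieldnames: List[str]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def output_name(table_name: str) -> str:
    """Build the {}.csv file name from the target table name."""
    return "{}.csv".format(table_name)


raw = [{"CRID": "1001", "BEAT": "0412", "EXTRA": "unused"}]
transformed = transform_rows(raw, COLUMN_MAP)
csv_text = to_csv_string(transformed, list(COLUMN_MAP.values()))
file_name = output_name("data_allegations")
```

Keeping the mapping as plain data means each response type only needs its own dictionary, while the transform and serialization helpers can be shared.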

Here are the foia response types:

tw-jeff-burroughs commented 5 years ago

I am going to try and work on the accused foia type

adesca commented 5 years ago

I am going to work on the civilian witness foia type

Tohoma commented 5 years ago

I'll be working on the investigators foia type

adesca commented 5 years ago

Civilian witness looks like it has data that isn't used by the II database, so I'm marking it as done. Now starting on complainant.

adesca commented 5 years ago

Working on cpd witness

adesca commented 5 years ago

Skipping cpd_witness because it feeds into 4 different tables (officer history, badge number, investigator, officer allegation). Will be done as a separate story.

terrencebunkley commented 5 years ago

@adesca We are already performing transformations that feed into multiple tables. I worked with @tw-jeff-burroughs on one yesterday. I think this should stay in the same story. It is just a different task.

tw-jeff-burroughs commented 5 years ago

~Exploring cpd witness transform~ The data needed to transform this into separate files/tables seems incomplete. A separate story/task will be created for this.