rubyforgood / pet-rescue

Pet Rescue is an application making it easy to link adopters/fosters with pets. We work with grassroots pet rescue organizations to understand how we can make the most impact.
MIT License

CSV upload: write a CSV import service #869

Closed: kasugaijin closed this issue 3 weeks ago

kasugaijin commented 1 month ago

We need to be able to import a CSV of third-party form data containing adopter responses to questions. We will need to match a Person in the app with a row in the CSV using an email address, and then save a new FormAnswer record for each question and answer, storing both the question and the answer on the record. All of the FormAnswers will belong to a single FormSubmission record, which acts as a binder for the FormAnswers and represents a submission of data at a given time.

Note that we need to be able to determine whether a given row has already been imported. If it has, we skip it. If it has not, we import it if there is a matching Person. To determine whether a row has already been imported, we need to investigate what data is provided in form CSV outputs by common services like Google Forms. We might be able to use a date/timestamp and save that on the FormSubmission. If a person has a FormSubmission with a given timestamp, we skip that row. If not, we treat it as a new submission and import it.
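
A minimal sketch of the skip-or-import flow described above, written in plain Ruby with in-memory stand-ins instead of the app's ActiveRecord models. The `Timestamp` and `Email Address` column names, the `import_csv` helper, and the Struct definitions are all assumptions for illustration; the real headers depend on the form service and should be checked against an actual export.

```ruby
require "csv"

# In-memory stand-ins for the app's models (hypothetical, not the real schema).
Person = Struct.new(:email, :submissions)
FormSubmission = Struct.new(:csv_timestamp, :answers)
FormAnswer = Struct.new(:question, :answer)

# Assumed header names; confirm against a real export from the form service.
TIMESTAMP_HEADER = "Timestamp"
EMAIL_HEADER = "Email Address"

# Imports each row for a matching person unless that person already has a
# submission with the same timestamp. Returns a count for each outcome.
def import_csv(csv_text, people)
  counts = { imported: 0, skipped_duplicate: 0, skipped_no_person: 0 }

  CSV.parse(csv_text, headers: true).each do |row|
    person = people.find { |p| p.email == row[EMAIL_HEADER] }
    if person.nil?
      counts[:skipped_no_person] += 1
      next
    end

    timestamp = row[TIMESTAMP_HEADER]
    if person.submissions.any? { |s| s.csv_timestamp == timestamp }
      counts[:skipped_duplicate] += 1
      next
    end

    # Every remaining column is treated as a question/answer pair.
    answers = row.to_h
                 .reject { |header, _| [TIMESTAMP_HEADER, EMAIL_HEADER].include?(header) }
                 .map { |question, answer| FormAnswer.new(question, answer) }
    person.submissions << FormSubmission.new(timestamp, answers)
    counts[:imported] += 1
  end

  counts
end

csv_text = <<~CSV
  Timestamp,Email Address,Do you have other pets?
  2024/07/01 10:15:00,adopter@example.com,Yes
  2024/07/01 11:30:00,unknown@example.com,No
CSV

people = [Person.new("adopter@example.com", [])]
first_run = import_csv(csv_text, people)   # imports the matching row
second_run = import_csv(csv_text, people)  # same file again: nothing new is imported
```

The sketch compares timestamps as raw strings. Whether to parse them into real datetimes before saving is a separate decision that depends on the formats the form services actually emit.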

Ask away with questions! Note that the FormAnswer model still needs to be renamed (issue https://github.com/rubyforgood/pet-rescue/issues/867)

jmilljr24 commented 1 month ago

I took a peek at this and a couple of questions came to mind. Looking at a sample Google Form CSV, I think checking the DB for a person and datetime (timestamp from the row) will be the easiest solution.

  1. Will FormAnswer belong to User or Person? FormSubmission is indexed on Person but I don't see that change listed for FormAnswer in #867 .
  2. A Person query based on email could have issues depending on how the email was entered/saved in the CSV (case sensitivity). I know it can be dealt with in the query but I'm not sure if that's the best solution. Here is a quick read on some options.
  3. If the email/person is not in the database do we do anything?

kasugaijin commented 1 month ago

@jmilljr24 FormAnswer is currently being updated in https://github.com/rubyforgood/pet-rescue/issues/867, so FormAnswer will belong to a FormSubmission, which will belong to a Person. We might want to think of a nicer way to get the email than form_answer.form_submission.person.user.email though. This CSV service will technically be blocked until that issue is done…but certainly there’s a lot that can be done before then, too.
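
One possible way to shorten that chain is delegation. The sketch below uses Ruby's standard `Forwardable` module on plain-Ruby stand-ins; in the Rails models the equivalent would more likely be ActiveSupport's `delegate` macro. The class shapes and the `person_email` method name are hypothetical, not the app's actual code.

```ruby
require "forwardable"

# Plain-Ruby stand-ins for the models in the chain (hypothetical shapes).
User = Struct.new(:email)

class Person
  extend Forwardable
  attr_reader :user

  # person.email forwards to person.user.email
  def_delegator :user, :email

  def initialize(user)
    @user = user
  end
end

class FormSubmission
  extend Forwardable
  attr_reader :person

  # form_submission.person_email forwards to form_submission.person.email
  def_delegator :person, :email, :person_email

  def initialize(person)
    @person = person
  end
end

class FormAnswer
  extend Forwardable
  attr_reader :form_submission

  # form_answer.person_email forwards to form_answer.form_submission.person_email
  def_delegator :form_submission, :person_email

  def initialize(form_submission)
    @form_submission = form_submission
  end
end

user = User.new("adopter@example.com")
answer = FormAnswer.new(FormSubmission.new(Person.new(user)))
email = answer.person_email # same value as the full four-step chain
```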

I didn’t think about email case in the db so I appreciate you bringing that up. We should select an appropriate solution on our call!
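
As one option to bring to that discussion, both sides of the comparison could be normalized in Ruby before matching. This is only a sketch: `normalize_email` is a hypothetical helper name, the sample addresses are made up, and the call may well settle on a database-level approach instead.

```ruby
# Normalizes an email for comparison: trims surrounding whitespace and
# lowercases it. Non-string input (such as nil) becomes an empty string.
def normalize_email(raw)
  raw.to_s.strip.downcase
end

# Index the stored emails by their normalized form so that each CSV row
# needs only a single hash lookup.
db_emails = ["Adopter@Example.com", "foster@example.com"]
people_by_email = db_emails.to_h { |email| [normalize_email(email), email] }

csv_email = "  adopter@example.COM "
match = people_by_email[normalize_email(csv_email)]       # finds the stored address
no_match = people_by_email[normalize_email("x@example.com")] # nil, so the row is skipped
```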

Yes, we need to think about how to handle these cases. I think that if the email is in the CSV but not in the DB under a User record, we do nothing.