Closed njtierney closed 3 years ago
Interesting! If I add on
pull(year) %>%
  unique()
it's restricted to 2015 and 2018. Should those datasets be revisited?
This may be due to the wrong encoding of student_id. It looks like a character column, and it reads exactly as we see in the table above.
I'll go back and check whether I introduced this error when binding the columns, or whether it comes from a wrong encoding in one of the underlying datasets (in that case you will have to act on your side).
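As a rough illustration of how an ID's storage type can create spurious duplicates, here is a minimal base R sketch. The ID values are made up for the example and are not taken from the actual data; it only shows the general mechanism, not a diagnosis of this issue.

```r
# Hypothetical IDs stored as character: all three are distinct strings
ids_chr <- c("00012", "0012", "12")

# Coercing to integer drops the leading zeros, so they collapse to one value
ids_int <- as.integer(ids_chr)

n_chr <- length(unique(ids_chr))  # distinct IDs as character
n_int <- length(unique(ids_int))  # distinct IDs after integer coercion

print(c(character = n_chr, integer = n_int))
```

If the distinct counts differ between the two encodings, the coercion step is merging IDs that were separate in the raw files.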
So, tsibble::duplicates() breaks my laptop for some reason, but I think I solved this. PR in the afternoon.
@njtierney @dicook care to take a look and, if it looks fine, merge the PR?
Following up on this issue: can someone please check when you have a second, or assign it to yourself as a to-do?
I had a similar problem working with this data a few days ago, but I can't remember the exact commit(s) that solved it. I think it was due to the class of school_id: a previous version stored the ID as integer/numeric, so there was internal truncation or something similar. When I checked the AUS schools, there were only two schools, which is obviously incorrect. The latest version of the check, using janitor::get_dupes, did not return any duplicated rows.
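For anyone wanting to run a similar check, here is a minimal sketch on made-up data (it is not the output from the comment above). The tibble `students` and its values are hypothetical; only the column names echo those mentioned in this thread.

```r
library(dplyr)
library(janitor)

# Hypothetical toy data; IDs kept as character to avoid coercion issues
students <- tibble::tibble(
  year       = c(2015, 2015, 2018, 2018),
  country    = c("AUS", "AUS", "AUS", "AUS"),
  school_id  = c("0001", "0001", "0002", "0002"),
  student_id = c("00001", "00001", "00001", "00002")
)

# get_dupes() returns the rows that share the same values in the named
# columns, with an extra dupe_count column; rows 1 and 2 match here
dupes <- students %>%
  get_dupes(year, country, school_id, student_id)

print(dupes)
```

When no rows are duplicated on the chosen columns, `get_dupes()` returns an empty data frame, which is the outcome described for the latest version of the data.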
It looks like there might be some duplicates, e.g. student_id is duplicated below.
Created on 2019-12-18 by the reprex package (v0.3.0)