hadley / r4ds

R for data science: a book
http://r4ds.hadley.nz

Possible problem with the Seattle-Library-dataset #1594

Closed luisfpsc closed 10 months ago

luisfpsc commented 10 months ago

Hello Mr. Hadley,

First of all, I would like to thank you for your work. R4DS has helped a lot in my new job and in my master's.

I recently tried out arrow in the 23rd chapter of your book and ran into a problem with the Seattle Library CSV. I tried to run the following code:

seattle_csv |> 
  group_by(CheckoutYear) |> 
  write_dataset(path = pq_path, 
                format = 'parquet')

I got the following error message: Invalid: In CSV column #7: Row #106154: CSV conversion error to null: invalid value '0394829131, 0394929136'

I don't know if it happened because the original file was updated at the source. I can try to run similar code using other data, but I thought it might be helpful to let you know of possible problems with the data.

Again, thanks a lot for the work that you and your colleagues have done.

Best Regards,

Luis Filipe Cossi

mine-cetinkaya-rundel commented 10 months ago

Thank you for reporting! We're already tracking this in #1374 and #1533, so I'll close this in favor of those.
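
For anyone who hits the same error: the offending value looks like a pair of ISBNs, which suggests that arrow inferred a null type for that column (possibly because the early rows are blank) and then failed on the first non-empty value. A minimal sketch of one possible workaround, assuming that is the cause, is to declare the column as a string when opening the dataset. The tiny CSV below is a made-up stand-in for the real file, and `csv_path` and the `ISBN` column name are assumptions; `pq_path` and `CheckoutYear` come from the code in the report.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for the Seattle Library CSV: ISBN is blank in the
# first row and holds a comma-separated pair of values in the second.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "CheckoutYear,ISBN",
    "2005,",
    "2006,\"0394829131, 0394929136\""
  ),
  csv_path
)

# Declare ISBN as a string up front instead of relying on type inference.
seattle_csv <- open_dataset(
  sources = csv_path,
  col_types = schema(ISBN = string()),
  format = "csv"
)

# Same pipeline as in the report, writing one partition per CheckoutYear.
pq_path <- tempfile()
seattle_csv |>
  group_by(CheckoutYear) |>
  write_dataset(path = pq_path, format = "parquet")
```

With the real file, `sources` would point at the downloaded CSV instead of the temporary one. Whether this matches the fix discussed in #1374 and #1533 is not confirmed here.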