fabragaMS / ADPE2E

Azure Data Platform End-to-End
Yellow Trip Data CSV Files have a double header #2

Closed. MarchingBug closed this issue 5 years ago

MarchingBug commented 5 years ago

First of all, fantastic workshop, thank you so much for putting this together.

Lab 2 - Azure Data Factory - Copy NYC Taxi Data to Data Warehouse fails.

I looked at the source files in blob storage, and they have a double header.

The first line is what is causing the error. I tried downloading the files from blob storage, but it is taking too long.

I wanted to let you know.

fabragaMS commented 5 years ago

Hey MarchingBug, thanks for that. I'm glad you enjoyed the content.

As to the Taxi files, that's how they are saved at the source website: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page

The Lab 2 instructions take that into consideration so everything runs smoothly. Please let me know if you still run into problems when following the lab steps. Thank you!

MarchingBug commented 5 years ago

I tried Lab 2 several times, but I was not able to get it running. My thought was to add a first step that removes the "header" from the files.

I am going to try that. Your workshop is amazing. I am delivering a one-day Data Estate Workshop roadshow in the northeast for our EDU customers next month, and your lab is far superior to what I was putting together.
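
For readers who want to try the same workaround, dropping the first line of a file before loading it might look roughly like the Python sketch below. This is only an illustration and not part of the lab materials: the function name `drop_first_line` and the file names in the usage note are hypothetical, and it assumes the unwanted header is exactly the first line of a file that has already been downloaded locally.

```python
import shutil
from pathlib import Path


def drop_first_line(src: Path, dst: Path) -> None:
    """Copy src to dst, skipping the first line (hypothetical helper)."""
    # Binary mode leaves the remaining bytes (encoding, line endings) untouched.
    with src.open("rb") as fin, dst.open("wb") as fout:
        fin.readline()  # read and discard the first line
        shutil.copyfileobj(fin, fout)  # stream the rest without loading it all into memory
```

A call such as `drop_first_line(Path("trips.csv"), Path("trips_clean.csv"))` (both file names made up here) writes a copy without the first line. The copy is streamed because the trip files are large.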

fabragaMS commented 5 years ago

Hi Ana, about Lab 2: as a suggestion, could you replace the dataset definition with the JSON provided in the lab instructions? I tried it myself here and it works fine, so I'm unable to reproduce the issue.

fabragaMS commented 5 years ago

Hi Ana, just wondering if you managed to complete the labs? Let me know if you still have any problems so I can close this issue. Thanks!