microsoft / studentambassadors

This repository is for Microsoft Learn Student Ambassadors.
MIT License

Saved data partitioned files fail <data engineer> #217

Open PascalBurume opened 1 year ago

PascalBurume commented 1 year ago

Describe the bug

Running the Parquet write code did not save all of the expected Parquet files in the transformed_data/orders folder.

What happened?

I encountered an issue while trying to save the partitioned data files to transformed_data/orders using the following code:

transformed_df.write.mode("overwrite").parquet('Files/transformed_data/orders')

After executing the code, I expected to see a new folder named orders within the transformed_data folder, containing one or more Parquet files. However, when I refreshed the Files node in the Explorer pane and checked the orders folder, it contained only one Parquet file instead of the three I expected.

What did you expect to happen?

The desired outcome is three Parquet files in the transformed data folder and, in the partitioned data folder, three subfolders, each corresponding to a specific year: 2019, 2020, and 2021.
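To check what was actually written, rather than relying on the Explorer pane alone, the Parquet part files can be counted per folder. The sketch below uses only the Python standard library. `count_parquet_files` is a hypothetical helper, not part of Spark or Fabric, and it assumes the output folder is reachable as a local file path. Spark generally writes one part file per DataFrame partition, so a single file in orders is not necessarily an error by itself.

```python
from collections import Counter
from pathlib import Path


def count_parquet_files(root):
    """Count .parquet part files under root, grouped by folder relative to root.

    Hypothetical helper for illustration only. Files directly under root
    are reported under the key ".", and files in partition subfolders
    (for example "Year=2019") are reported under the subfolder name.
    """
    root_path = Path(root)
    counts = Counter()
    # rglob walks root and all nested folders; marker files such as
    # _SUCCESS do not match the pattern and are ignored.
    for path in root_path.rglob("*.parquet"):
        folder = path.parent.relative_to(root_path).as_posix()
        counts[folder] += 1
    return dict(counts)
```

In a notebook this could be called as `count_parquet_files("/lakehouse/default/Files/transformed_data/orders")`, assuming the default lakehouse is mounted at that local path. An unpartitioned write should show all files under ".", while a write partitioned by year should show one key per year folder.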

How would you reproduce the bug?

Go to the exercise Analyze data with Apache Spark. In section 6, Use Spark to transform data files, go to Save the transformed data (image). I should get the same result after running the code, but I got this instead (image).

Did this happen via Desktop and/or Smartphone?

Desktop

Operating System

Windows 11

What browsers are you seeing the problem on?

Microsoft Edge

Version

Version 116.0.1938.27 (Official build) dev (64-bit)

Device

HP EliteBook 1030 x360

What is your current role?

Student Ambassador

What technical topic is this bug related to?

AI/Data Science/Machine Learning

Relevant log output

No response

github-actions[bot] commented 1 year ago

Thank you for submitting this issue! The team will review your issue, tag it with the appropriate tags, and comment with any additional questions or information needed. :sparkles:

vivsridh4 commented 1 year ago

@PascalBurume, can you retest this scenario? I am able to proceed without any issues.

PascalBurume commented 1 year ago

Okay, I will retest it.