data-engineering-community / data-engineering-wiki

The best place to learn data engineering. Built and maintained by the data engineering community.
https://dataengineering.wiki
Creative Commons Zero v1.0 Universal
1.22k stars, 120 forks

Added: Updated Apache Spark info + RDD example on Tutorials folder #11

Closed (icharo-tb closed 1 year ago)

icharo-tb commented 1 year ago

I updated some information on the Apache Spark tool. I also made a tutorial on some simple SparkContext RDD operations.

I may be doing something wrong with branches, since the PR is also requesting the Hadoop changes you made. I need to look into this so I can do it better in the future. If there is something wrong or something that needs to be changed, tell me.

JPHaus commented 1 year ago

What you'll need to do is get the latest changes from data-engineering-community:main. See this summary for how to update your fork.
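
For reference, the fork update suggested above can be sketched roughly as follows. This is a minimal sketch and not necessarily what the linked summary describes: the function name `sync_fork`, the remote name `upstream`, and the choice of a fast-forward-only merge are all assumptions made for illustration.

```shell
# Hypothetical helper: bring a local branch of a fork up to date with upstream.
# Assumptions: the upstream remote is called "upstream" and the branch
# defaults to "main"; neither name comes from the thread's linked summary.
sync_fork() {
  upstream_url="$1"
  branch="${2:-main}"

  # Register the upstream remote only if it is not configured yet.
  git remote get-url upstream >/dev/null 2>&1 ||
    git remote add upstream "$upstream_url" || return 1

  # Download upstream history, switch to the branch, and fast-forward it.
  # --ff-only refuses to create a merge commit if the branch has diverged.
  git fetch upstream &&
    git checkout "$branch" &&
    git merge --ff-only "upstream/$branch"
}
```

Run from inside the fork's working copy, this would be called as `sync_fork <upstream-url>`, followed by `git push origin main` to update the fork on GitHub. The upstream URL is presumably the GitHub clone URL of data-engineering-community/data-engineering-wiki. Branching new work off the freshly synced `main` should keep unrelated upstream changes, such as the Hadoop edits, out of the PR.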

icharo-tb commented 1 year ago

Ohh! So I was not doing the fetch correctly, as I can see from the link you gave me. I will check it out tomorrow morning. Thanks! :)

icharo-tb commented 1 year ago

Ok, I updated the fork and, I don't know why, it added a change to the Hadoop md file, copying the table and adding an upstream/main tag. It's quite troublesome. Still, I hope everything is alright now, and I hope you find the PR interesting enough to add!
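
A duplicated table together with an "upstream/main tag" sounds like it could be leftover merge-conflict markers (a line such as `>>>>>>> upstream/main`), though that is only a guess from the description. Assuming that is the cause, a small check like the one below could locate them. The helper name `has_conflict_markers` is hypothetical.

```shell
# Hypothetical helper: print any leftover merge-conflict marker lines in a file.
# Exit status is 0 if at least one marker is found, non-zero otherwise.
# Caveat: a line of exactly seven "=" characters (for example a Markdown
# setext heading underline) would also match.
has_conflict_markers() {
  grep -nE '^(<<<<<<<|=======|>>>>>>>)( |$)' "$1"
}
```

If it reports matches in the Hadoop file, keeping one copy of the table and deleting the three marker lines before committing should clean up the diff.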