polomarcus / tp

GNU General Public License v3.0

Practical work

Tools you need for data engineering

You should have Docker and a Scala IDE working properly on your computer.

A Scala Integrated Development Environment (IDE):

  1. Install the IDE "IntelliJ IDEA Community": https://www.jetbrains.com/fr-fr/idea/download
  2. Install the Scala plugin; it gives you an SBT (Scala Build Tool) shell at the bottom of your IDE.
  3. Fork this repo to have your own copy.
  4. Clone your fork on your machine.
  5. Open IntelliJ, go to "File", then "Open", and select only the folder of the current exercise from the clone of your fork. Then, in IntelliJ, set up the Scala SDK (Software Development Kit) and the JDK (Java Development Kit).

Common mistakes

Do you have one of these errors?

Extracting Structure Failed
Cannot determine Java VM executable in selected JDK

This will solve your problem: do this.

Beware: for Spark code, we have to use Java 17 (JDK 17) or lower; otherwise you'll get this error.
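To confirm which JDK is active before restarting the SBT shell, a small check along these lines may help. It is only a sketch: the helper names `java_major` and `check_jdk` are made up for this example, and it assumes the usual version formats (`17.0.9`, or the legacy `1.8.0_392`).

```shell
# Hypothetical helpers, not part of this repo.

# Print the major version for a string such as 17.0.9 or 1.8.0_392.
java_major() {
  case "$1" in
    1.*) v="${1#1.}"; printf '%s\n' "${v%%.*}" ;;   # legacy scheme: 1.8.x means Java 8
    *)   printf '%s\n' "${1%%[.+-]*}" ;;            # modern scheme: 17.0.9 means Java 17
  esac
}

# Succeed when the given version is 17 or lower, fail otherwise.
check_jdk() {
  major=$(java_major "$1")
  case "$major" in
    ''|*[!0-9]*) echo "could not read a Java version from '$1'"; return 1 ;;
  esac
  if [ "$major" -le 17 ]; then
    echo "ok: Java $major"
  else
    echo "too new: Java $major (the Spark exercises need 17 or lower)"
    return 1
  fi
}

# Usage, assuming `java` is on your PATH (it prints its version on stderr):
# check_jdk "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
```

If the check fails, select a JDK 17 or lower in the IntelliJ project settings rather than changing the system-wide Java.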

Now, you can restart the SBT shell using the button with the arrows.

[success] Total time: 31 s

After a successful restart you can execute run or test commands.

Docker and Compose

Take the time to read the documentation and install both tools:

https://docs.docker.com/get-started/overview/

docker --version
Docker version XX.XX.XX

https://docs.docker.com/compose/

docker compose --version # Or docker-compose --version
docker compose version X.XX.X
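The two checks above can be combined into one script. This is a minimal sketch: the function names `have_tool` and `check_docker_setup` are invented for illustration, and it assumes Compose is available either as the `docker compose` plugin or as the standalone `docker-compose` binary.

```shell
# Hypothetical helpers, not part of this repo.

# Succeed when a command is available on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Print the Docker and Compose versions, or explain what is missing.
check_docker_setup() {
  if ! have_tool docker; then
    echo "docker: not found on PATH"
    return 1
  fi
  docker --version

  if docker compose version >/dev/null 2>&1; then
    docker compose version          # Compose installed as a Docker plugin
  elif have_tool docker-compose; then
    docker-compose --version        # standalone Compose binary
  else
    echo "compose: not found"
    return 1
  fi
}
```

Run `check_docker_setup` in a terminal after installing; a non-zero exit status means one of the two tools still needs to be installed.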

Fork OR update the repo on your own GitHub account

Fork

Update your fork