Open DanielRamirezSanchez opened 5 years ago
Since I'm only using 2 DataNodes, the terasort took quite some time (about 17 minutes, if I recall correctly), but everything worked fine. Process documented.
Got stuck like an idiot while trying to put the zip file into HDFS... I was doing everything in my home directory, and the hdfs user could see nothing there. I moved to a different folder to download the zip and then uploaded it to HDFS from there.
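For anyone hitting the same problem, here is a minimal sketch of the workaround: stage the file somewhere the hdfs user can read, then upload from there. The file name, the /tmp staging directory and the HDFS target are placeholders I made up, not the ones from the exercise, and the helper only builds the commands so they can be reviewed before running them.

```python
import os
import shlex


def upload_commands(local_file, hdfs_dir, staging_dir="/tmp", run_as="hdfs"):
    """Return shell commands that stage a file and put it into HDFS.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    staged = os.path.join(staging_dir, os.path.basename(local_file))
    commands = [
        # Copy out of the home directory, which other users usually cannot enter.
        ["cp", local_file, staged],
        # Make sure the staged copy is readable by the hdfs user.
        ["chmod", "644", staged],
        # Create the target directory in HDFS (no error if it already exists).
        ["sudo", "-u", run_as, "hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Upload the staged copy.
        ["sudo", "-u", run_as, "hdfs", "dfs", "-put", staged, hdfs_dir],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in upload_commands("/home/me/data.zip", "/user/hdfs/precious"):
        print(line)
```

Printing the commands first makes it easy to check which user and which paths are involved before touching the cluster.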
The folder that was made snapshottable can't be deleted, but its content (the zip file) can. The zip file has been deleted and restored from the snapshot. Process documented.
The folder couldn't be deleted because, obviously, it contained the snapshot. So deleting the folder would mean deleting the snapshots inside it.
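The snapshot round trip can be sketched roughly as follows. The directory, file and snapshot names are placeholders of mine (the documented process may use different ones), and the helpers only build the `hdfs` commands instead of running them.

```python
import shlex


def snapshot_roundtrip(directory, file_name, snapshot="snap1"):
    """Commands to snapshot a directory, delete a file, and restore it."""
    steps = [
        # Mark the directory as snapshottable (admin command).
        ["hdfs", "dfsadmin", "-allowSnapshot", directory],
        # Take a named snapshot of the current content.
        ["hdfs", "dfs", "-createSnapshot", directory, snapshot],
        # Delete the file; this works even though a snapshot exists.
        ["hdfs", "dfs", "-rm", f"{directory}/{file_name}"],
        # Restore the file by copying it back from the read-only .snapshot path.
        ["hdfs", "dfs", "-cp",
         f"{directory}/.snapshot/{snapshot}/{file_name}", directory],
    ]
    return [shlex.join(step) for step in steps]


def cleanup_commands(directory, snapshot="snap1"):
    """Commands needed before the snapshottable directory can be removed."""
    steps = [
        # The snapshot has to go first; otherwise the directory delete is refused.
        ["hdfs", "dfs", "-deleteSnapshot", directory, snapshot],
        ["hdfs", "dfsadmin", "-disallowSnapshot", directory],
        ["hdfs", "dfs", "-rm", "-r", directory],
    ]
    return [shlex.join(step) for step in steps]


if __name__ == "__main__":
    for line in snapshot_roundtrip("/precious", "data.zip"):
        print(line)
    for line in cleanup_commands("/precious"):
        print(line)
```

The second helper reflects why the delete failed: the directory can only be removed once its snapshots are gone.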
Created the /dfs/jn folders with the correct owner, group and permissions (drwx------ hdfs hadoop) on each node where I'm putting the JournalNodes.
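A small sketch of how the directory setup could be scripted per node, assuming Python 3 is available there. The helper name is hypothetical; changing the owner to hdfs:hadoop requires root and requires that user and group to exist on the node, so ownership is optional in the function.

```python
import os
import shutil
import stat


def make_private_dir(path, user=None, group=None):
    """Create `path` with mode 0700 and, optionally, set its owner and group.

    Returns the resulting mode string, e.g. 'drwx------'.
    """
    os.makedirs(path, exist_ok=True)
    # Set the mode explicitly so the result does not depend on the umask.
    os.chmod(path, 0o700)
    if user is not None or group is not None:
        # Needs root when handing the directory to another user.
        shutil.chown(path, user=user, group=group)
    return stat.filemode(os.stat(path).st_mode)


# Intended use on a JournalNode host, run as root:
#   make_private_dir("/dfs/jn", user="hdfs", group="hadoop")
```

Returning the mode string gives a quick check against the expected `drwx------` without having to run `ls -ld` by hand.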
I took the screenshot of the HDFS Instances tab. The concerning health problems are related to disk space, since the previous exercise (teragen, terasort) used up a large amount of space for the generated files.
This is the image of the HA setup I was talking about in the previous comment.
Created the user specified in the exercise. After switching to that user, I limited the admin user and took a screenshot.
The Cloudera Manager of my cluster can be accessed through: http://sebc-vm1.westeurope.cloudapp.azure.com:7180/cmf/home
If you have problems connecting to it, let me know your IP to grant you access through the Microsoft Azure Networking configuration.
Tried to communicate with other clusters using distcp (mine is on Azure, the other clusters were on AWS and Google), but we didn't manage to get it working because of some kind of permission problem. The command was working correctly within our own clusters, so we went that way instead of losing more time on the communication between clusters.
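For reference, a cross-cluster distcp call has roughly the shape below. The host names and paths are invented placeholders, and port 8020 is only an assumption about the NameNode RPC port; the helper just assembles the command string.

```python
import shlex


def distcp_command(src_namenode, dst_namenode, src_path, dst_path, port=8020):
    """Build a `hadoop distcp` command between two HDFS clusters.

    Host names, paths and the port are placeholders, not real cluster values.
    """
    src = f"hdfs://{src_namenode}:{port}{src_path}"
    dst = f"hdfs://{dst_namenode}:{port}{dst_path}"
    return shlex.join(["hadoop", "distcp", src, dst])


if __name__ == "__main__":
    print(distcp_command("nn-a.example.com", "nn-b.example.com",
                         "/user/a/in", "/user/b/out"))
```

Within a single cluster, the same command works with both URIs pointing at the local NameNode, which matches what we ended up doing.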