Hello,
We will do that soon.
For now, you can use MongoDB to store data.
To install it, follow these instructions for RHEL8/9: https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-red-hat/.
Once it is installed, start the database with `systemctl start mongod`.
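For reference, a minimal sketch of the install on a RHEL-family system; the linked page has the exact yum repository definition for your MongoDB version, which I'm omitting here:

```sh
# After adding the MongoDB yum repository as described in the linked docs:
sudo yum install -y mongodb-org
sudo systemctl start mongod
sudo systemctl enable mongod   # optional: also start on boot
```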
Then you can just use RobinHood V4 to create the database automatically, for instance with `rbh-sync`:

```
rbh-sync rbh:posix:/tmp rbh:mongo:test_db
```

This will create the database and the collection at the same time, the collection being `entries`.
So after the `rbh-sync`, you can run `mongosh` to check the database, and then:

```
use test_db
db.entries.find()
```

to show all entries in the database.
Alternatively, after the sync, you can run `rbh-find rbh:mongo:test_db` to search all entries in the database through `rbh-find`.
Kind regards, Yoann Valeri
Hello Yoann,
Thanks for your quick response.
We will set up MongoDB to store data.
We have Rocky Linux 9.4 with a Lustre client (lctl 2.14.0_ddn154). The Lustre file system is mounted at `/ictstr01`.
Is the syntax below correct?

```
rbh:lustre:/ictstr01 rbh:mongo:/robinhood/test_db
```

`/robinhood` is a local mount with SSD disks.
Many thanks, Bom Singiali
For the Lustre part, that is the correct syntax.
For the Mongo part, however, you only have to specify the name of the database. Where that database is stored is up to Mongo, but you can check it in the `/etc/mongod.conf` file.
By default, databases/collections are stored in `/var/lib/mongo`, so you can just change that path to `/robinhood/test_db`.
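As a sketch, assuming the defaults from the RHEL packages (paths and ownership may differ on your system):

```sh
# Point MongoDB's dbPath at the SSD-backed mount, then restart it.
# /etc/mongod.conf is YAML; its storage section should read:
#   storage:
#     dbPath: /robinhood/test_db
sudo mkdir -p /robinhood/test_db
sudo chown mongod:mongod /robinhood/test_db
sudo systemctl restart mongod
```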
Once you have made this change, the correct syntax for the whole command would be:
```
rbh-sync rbh:lustre:/ictstr01 rbh:mongo:test_db
```
Kind regards, Yoann Valeri
Thanks Yoann,
MongoDB has been set up. However, `rbh-sync` gives an error:
```
[root@lustre-stats ~]# rbh-sync rbh:lustre:/ictstr01 rbh:mongo_db_lustre
rbh-sync: Cannot detect given backend: Invalid argument
[root@lustre-stats ~]# systemctl status mongod.service
● mongod.service - MongoDB Database Server
     Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; preset: disabled)
     Active: active (running) since Wed 2024-11-13 15:16:03 CET; 3min 51s ago
       Docs: https://docs.mongodb.org/manual
   Main PID: 167328 (mongod)
     Memory: 215.2M
        CPU: 2.451s
     CGroup: /system.slice/mongod.service
             └─167328 /usr/bin/mongod -f /etc/mongod.conf

Nov 13 15:16:03 lustre-stats.scidom.de systemd[1]: Started MongoDB Database Server.
Nov 13 15:16:03 lustre-stats.scidom.de mongod[167328]: {"t":{"$date":"2024-11-13T14:16:03.963Z"},"s":"I", "c":"CONTROL", "id":7484500, "ctx":"main","msg":"Environment variable MONGODB_CONFIG_OVERRIDE_NOFORK == 1, overriding \"processManagement.fork\" >
```
Please advise, thanks.
Best Regards Bom Singiali
Hello,
If I take the code snippet literally, there is a missing `:` between `mongo` and `db_lustre` in the second URI.
The `rbh-sync` line should rather be:

```
rbh-sync rbh:lustre:/ictstr01 rbh:mongo:db_lustre
```
Kind regards, Yoann
Thanks, the previous error has been resolved.
Here are the input used and the standard output from the terminal:
```
[root@lustre-stats ~]# rbh-sync rbh:lustre:/ictstr01 rbh:mongo:mongo_db_lustre
Failed to stat '/boost_ai/users/test/bom.singiali/.testfile2.swp': No such file or directory (2)
Synchronization of '/ictstr01/boost_ai/users/test/bom.singiali/.testfile2.swp' skipped
```
This just means that there was an error trying to stat that particular file. Most likely the file was removed during the scan: rbh-sync started scanning the directory the file is in, saw that the file was there to scan, but by the time it actually tried to scan it (here, a stat), the file had been removed from the file system. The error is not critical; as rbh-sync reports, the entry was simply skipped.
And now you should have all the entries it tried to scan in your Mongo database `mongo_db_lustre` (I've described above how to check them), assuming the `rbh-sync` is over, of course.
Thanks Yoann,
The sync is ongoing.
We have ~18 PiB (14 PiB used) in Lustre.
Metadata ingestion into MongoDB is only ~10 GB per day. Is this expected? Do you suggest any optimizations to speed up this process? Thanks.
How many inodes does your system have? The raw capacity isn't a useful metric for RobinHood; the most relevant number is the number of inodes.
If you want to know how many inodes RobinHood handles, you can check the database with the command `db.entries.count()`.
That will show you the number of inodes in the database, so if you run it once, note the number, wait an hour and run it again, you'll know roughly how many inodes per hour RobinHood can handle.
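For instance, a small sketch to sample that from the shell, assuming the database name used above (`countDocuments()` is the modern mongosh equivalent of `count()`):

```sh
# Count entries now, wait an hour, count again; the difference is roughly
# the number of inodes RobinHood ingested during that hour.
mongosh --quiet mongo_db_lustre --eval 'db.entries.countDocuments()'
sleep 3600
mongosh --quiet mongo_db_lustre --eval 'db.entries.countDocuments()'
```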
Without those numbers, I can't really tell you whether what you get is expected or not.
However, if you want to speed up the process, you can either spawn multiple processes that each handle a subdirectory of your main system, or use the MPI File Utils backend.
For the former, you must use the `branch` feature of RobinHood V4, like this:

```
rbh-sync rbh:lustre:<main_directory>#<sub_directory> rbh:mongo:<your_db>
```

and then start this command in the background multiple times, changing the `sub_directory` each time (see the sketch below).
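A minimal sketch of that, with hypothetical top-level directory names and the database name used earlier:

```sh
#!/bin/sh
# Run one rbh-sync per top-level subdirectory in the background, then
# wait for all of them. The directory names here are placeholders.
for sub in projects scratch users; do
    rbh-sync "rbh:lustre:/ictstr01#$sub" rbh:mongo:mongo_db_lustre &
done
wait
```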
For the latter, you have to use the Lustre MPI backend of RobinHood V4:

```
rbh-sync rbh:lustre-mpi:<directory> rbh:mongo:<your_db>
```

You must run this command behind an `mpirun`; for that, I'll let you check the MPI File Utils documentation (https://mpifileutils.readthedocs.io/en/v0.11.1/index.html).
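As a rough sketch, such an invocation could look like the following; the rank count and hostfile are placeholders that depend entirely on your cluster (the flags shown are Open MPI's):

```sh
# 8 ranks spread over the nodes listed in ./hosts (both are examples);
# each rank takes part in the parallel scan of /ictstr01.
mpirun -np 8 --hostfile hosts \
    rbh-sync rbh:lustre-mpi:/ictstr01 rbh:mongo:mongo_db_lustre
```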
The first solution will be a single-node, multi-process improvement, while the second one is a multi-node, multi-process improvement, but requires additional architecture to work nicely.
Currently, those are the only two things I can suggest; working on the tools and the database to improve speed is the next item on our todo list after rbh-report is done.
Tell me if you need additional help :)
Hello there,
It would be great to have an Admin Guide, just like we had in V3:
https://github.com/cea-hpc/robinhood/wiki/robinhood_v3_admin_doc#user-content-Database_setup_and_tuning
Many thanks, Bom Singiali