Closed lmiroslaw closed 4 years ago
The following procedure worked for me.
beegfsm [ID: 1]: reachable at 10.34.4.14:8008 (protocol: TCP)
beegfa57e000000 [ID: 1]: reachable at 10.34.4.4:8005 (protocol: TCP)
beegfa57e000004 [ID: 2]: reachable at 10.34.4.8:8005 (protocol: TCP)
beegfa57e000003 [ID: 3]: reachable at 10.34.4.7:8005 (protocol: TCP)
beegfa57e000001 [ID: 4]: reachable at 10.34.4.5:8005 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8005 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8005 (protocol: TCP)
beegfa57e000001 [ID: 1]: reachable at 10.34.4.5:8003 (protocol: TCP)
beegfa57e000003 [ID: 2]: reachable at 10.34.4.7:8003 (protocol: TCP)
beegfa57e000004 [ID: 3]: reachable at 10.34.4.8:8003 (protocol: TCP)
beegfa57e000000 [ID: 4]: reachable at 10.34.4.4:8003 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8003 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8003 (protocol: TCP)
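The reachability listing above matches the output format of beegfs-check-servers. Assuming a standard BeeGFS installation, the node state can be re-checked at any time with:

```
# Query the management daemon for all registered nodes and ping each one:
beegfs-check-servers

# Or list registered nodes per service type explicitly:
beegfs-ctl --listnodes --nodetype=management --nicdetails
beegfs-ctl --listnodes --nodetype=meta
beegfs-ctl --listnodes --nodetype=storage
```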
We can see that 2 extra storage and metadata servers have been added.
It worked, thanks. However, it is strange that I don't see a performance improvement when doubling the size of beegfsm. I am testing the performance by copying a 24 GB folder between two locations:

time cp sim sim3 -R

The folder contains ca. 120 directories with several files each in the MB range (2.2 MB, 119 MB, 47 MB).
For the small and the bigger beegfsm I get essentially the same result:

real 2m26.809s
user 0m0.461s
sys  0m29.615s

vs.

real 2m32.859s
user 0m0.440s
sys  0m28.253s
I/O pattern: 55k reads and 50k writes, together accounting for ~90% of execution time.
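For reference, this kind of copy benchmark can be reproduced on any filesystem with a small synthetic tree. The directory names, counts, and file sizes below are illustrative stand-ins, not the original sim data:

```shell
# Build a small synthetic tree resembling the described workload
# (many directories, a few files each), then time a recursive copy.
src=$(mktemp -d)/sim
dst=$(dirname "$src")/sim3
mkdir -p "$src"
for d in $(seq 1 5); do
    mkdir -p "$src/dir$d"
    for f in $(seq 1 3); do
        # 64 KB dummy files; scale bs/count up to approximate real data
        dd if=/dev/zero of="$src/dir$d/file$f" bs=1024 count=64 2>/dev/null
    done
done

time cp -R "$src" "$dst"

# Sanity check: destination holds the same number of files as the source
echo "src: $(find "$src" -type f | wc -l) files"
echo "dst: $(find "$dst" -type f | wc -l) files"
```

Scaling bs/count and the loop bounds up recreates the ~120-directory, MB-range-file shape described above.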
I also tried changing the chunk size with

beegfs-ctl --setpattern --chunksize=1m --numtargets=8 /beegfs/chunksize_1m_4t

testing 1 MB, 64 kB and 4 MB chunk sizes with 8, 1 and 8 targets, respectively. This did not affect the results much.
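One thing worth noting: a striping pattern set with --setpattern applies only to files created after the change, so each test directory has to be configured before data is copied into it. A sketch, with a hypothetical directory name:

```
# Hypothetical test directory for a 1 MB chunk size across 8 targets:
mkdir /beegfs/chunksize_1m_8t
beegfs-ctl --setpattern --chunksize=1m --numtargets=8 /beegfs/chunksize_1m_8t

# After copying data in, verify the pattern actually took effect:
beegfs-ctl --getentryinfo /beegfs/chunksize_1m_8t
```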
Have you tried multiple cp's, perhaps each cp to a different target? You may need to determine whether the source data sits on 4 storage targets or more, and whether reading or writing is what limits performance.
Try to maximize the number of disks working on the I/O operation. beegfs-df can help show which disks/targets are active.
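Both checks suggested above can be done with standard BeeGFS tools; the path below is an illustrative placeholder, not from the original setup:

```
# Show which storage targets hold the chunks of a given file
# (path is a made-up example):
beegfs-ctl --getentryinfo /beegfs/sim/somefile --verbose

# Show capacity and usage per metadata and storage target;
# watching this during a copy reveals which targets are active:
beegfs-df
```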
First feedback: this is my first attempt to parallelize the cp operation:
N=7   # hypothetical count: number of processor directories minus one
for i in $(seq 0 "$N")   # note: brace expansion {0..N} does not expand a variable, hence seq
do
    cp -r "$sourcedir/processor$i/"* "$destination/processor$i" &
done
wait  # wait for all background cp processes to finish
With this code I was able to reduce the copy time from 1m41s to 58s. Now I will test the same code after doubling the size of the cluster.
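A variant of the loop above that bounds the number of concurrent copies (4 here, an arbitrary choice) uses xargs -P instead of unbounded background jobs; the processor0..processor7 layout below is synthetic, for illustration only:

```shell
# Synthetic source tree: processor0..processor7, one file each.
src=$(mktemp -d)/sim
dst=$(dirname "$src")/sim3
mkdir -p "$dst"
for i in $(seq 0 7); do
    mkdir -p "$src/processor$i"
    echo "data$i" > "$src/processor$i/file.dat"
done

# Copy each processor directory in parallel, at most 4 cp's at once:
printf '%s\n' "$src"/processor* | xargs -P4 -I{} cp -r {} "$dst/"
```

Capping concurrency avoids spawning 120 simultaneous cp processes when the directory count is large, while still keeping several storage targets busy.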
How can I add a new storage or metadata server to the cluster?
I have tried to follow the official documentation here, e.g. scaled out the compute/beegfssm VMSSs and restarted the services, but the nodes are still not recognized by the BeeGFS manager.
Am I missing something?
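For what it's worth, on a standard BeeGFS deployment a freshly scaled-out node usually has to run the setup script first, so that its target is initialized and pointed at the management host, before its service registers. The hostname and path below are placeholders, not from this deployment:

```
# On the new storage node (placeholder path and management hostname):
/opt/beegfs/sbin/beegfs-setup-storage -p /data/beegfs_storage -m <mgmt-host>
systemctl restart beegfs-storage

# Then confirm registration from any node:
beegfs-ctl --listnodes --nodetype=storage
```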