ontohub / ontohub

A web-based repository for distributed ontologies.
GNU Affero General Public License v3.0

new server for ontohub #282

Closed tillmo closed 8 years ago

tillmo commented 10 years ago

The VMs now have 8 GB of memory, so I am postponing this for now.

tillmo commented 10 years ago

OK, we need to specify what server hardware to buy. In general, we should consider a decentralised architecture. This means that the Hets processes will run on many different small servers, and the central server only needs to take care of the web page, the database and the git repositories. Should we buy one big server hosting 3 VMs (develop, staging, production), or three smaller servers, each of which is dedicated to a single task?

jelmd1 commented 10 years ago

Hmm, really good question! As a normal admin and looking at the README I would try to get away with a single "entry level" server.

I.e. not more than 2 sockets, but as many cores + threads (aka HW strands) as possible. That usually minimizes the maintenance burden and complexity of the boxes/OS and applications, and reduces single points of failure and unpredictable behavior. It also allows one to really share resources, so that all applications can benefit from having "a lot of HW [threads/RAM/etc.]" available (I guess the apps are usually not [always] running at max at the very same time).

Wrt. the DB, the traditional rule of thumb is "the more spindles, the better", because unless it is a "large" streaming/sequential workload, DB performance heavily depends on the available IOPS. This in turn also suggests using SAS HDDs, because a SAS HDD usually provides ~50..100% more random IOPS than a comparable SATA HDD (which usually runs at 7 kRPM today), and SAS HDDs can work bidirectionally (read and write at the same time), whereas SATA is always unidirectional.
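
The IOPS gap between drive classes can be sanity-checked with the classic back-of-the-envelope formula (the seek times below are illustrative assumptions for typical drive classes, not figures from any datasheet):

```python
# Rough estimate of random IOPS for a single HDD:
# IOPS ~ 1000 / (avg seek time + rotational latency), everything in ms.
# Average rotational latency is half a revolution: 60000 / (2 * RPM) ms.

def random_iops(avg_seek_ms: float, rpm: int) -> float:
    rotational_latency_ms = 60000.0 / (2 * rpm)
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

# Illustrative (assumed) average seek times per drive class:
sata_7k2 = random_iops(8.5, 7200)   # desktop SATA
sas_10k = random_iops(4.5, 10000)   # 10K SAS
sas_15k = random_iops(3.5, 15000)   # 15K SAS

print(f"7.2K SATA ~{sata_7k2:.0f} IOPS, "
      f"10K SAS ~{sas_10k:.0f} IOPS, 15K SAS ~{sas_15k:.0f} IOPS")
```

With these assumed seek times, the 10K and 15K SAS drives come out roughly 70-130% above the 7.2K SATA drive, which is consistent with the "~50..100% more" rule of thumb above.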

However, since RAM gets cheaper and cheaper, one can mitigate "slow" disks by keeping the working set cached in RAM (i.e. apps might be much happier with a lot of RAM instead of a lot of very fast spindles -> 10K instead of 15K drives). Then the biggest HDD impact will probably be seen when warming up the cache or when writes occur.

For the latter problem one may use an SSD (for best performance a RAM based one), which usually improves DB performance drastically.

Taking all this into account, and the ability of Solaris to run several native zones side by side, to share resources between them, and to cap their resources if necessary, I suggest something like http://iws.cs.uni-magdeburg.de/~elkner/supermicro/server3.html having a config like this in mind:

Using ZFS one can easily snapshot/clone e.g. a filesystem from the production or another zone within seconds (it doesn't really matter how big it is), test something new using basically the production environment and destroy it (or not) when done, or roll it back in < 1-2 seconds to its previous state to repeat the same test with different parameters, etc. IMHO ideal for development.

So IMHO having one rock-solid, high-performance server is much better than wasting resources on several small, half-hearted servers with limited performance. Anyway, I made a lot of assumptions wrt. the workload/working sets, and thus I might of course be completely wrong.

PS: The approximate gross price is the single-unit internet price I've found, just to get a feeling for the cost. So if a similar machine gets shipped by a [local] vendor, one probably needs to add ~20..30% ...

tillmo commented 10 years ago

Henning, Julian, can you comment on jelmd1 (he is the admin responsible for buying the server)?

nning commented 10 years ago

I wanted to discuss this with the others at our next meeting before commenting. On the hardware considerations of @jelmd1, I absolutely agree; but I am sure Solaris would increase our administration overhead, because of our unfamiliarity with it and the poor Solaris support of many gems.

tillmo commented 10 years ago

what about having Linux VMs on a Solaris host?

nning commented 10 years ago

I think Solaris supports virtualizing Linux only with VirtualBox, which is not professional server virtualization software: it has a crude, unusable terminal interface, so you're forced to use the Qt GUI via SSH X forwarding. I once had to run Ubuntu on Solaris/VirtualBox, and we even encountered random kernel panics with the then-current Ubuntu LTS version.

tillmo commented 10 years ago

It seems that zones also support Linux guests: https://en.wikipedia.org/wiki/Solaris_Containers

nning commented 10 years ago

I wouldn't call syscall translation for RHEL3 "supports Linux guests". ;-) If we actually want partitioning without full-blown virtual machines, we could also use LXC on Linux. Snapshots and clones also work with LVM.

jelmd commented 10 years ago

Well, Solaris 11 dropped support for Linux native zones. However, a similar (but not yet tried) alternative could be SmartOS, which is a fork of the last OpenSolaris version enhanced with KVM support (IIRC, they basically took the Linux KVM and ported it to the OS) and adds a lot of promising new features, which helps e.g. Joyent to drive its clouds. And nning is right, VirtualBox runs on Solaris (no wonder, it's in Oracle's hands), but I never had the need to run it on a server. I've heard that VirtualBox performance is the worst among all the alternatives, but when I occasionally run it on my desktop to test some Windows stuff, it is OK for me. Anyway, no matter which VM is used, any will suffer from the overhead of the lower layers, which usually results in bad IO. Solaris native zones, because they are basically just native processes with a zone-ID tag, do not get such a penalty (Brendan's blog has more on this). So at least wrt. DBs and VMs I have mixed feelings.

Wrt. Ruby: I don't know much about it, just that it is supported on Solaris (current version is 1.8.7 (2013-06-27 patchlevel 374)). Probably not the latest one, but as with any other enterprise OS (e.g. Debian), they need their time to test it and usually want to make sure that the stuff is backward compatible, i.e. that no upgrade breaks anything running. However, compiling/using a more recent version shouldn't be a problem.

Can't comment on the Linux-based solutions, because I've never used them :(. But since it is your server, you decide ;-)

tillmo commented 10 years ago

With @jelmd, I have discussed the following: a hybrid Linux-Solaris solution does not make sense. The server will initially run under Linux (and indeed LXC might be the Linux equivalent of Solaris' zones). @jelmd will try to port Ontohub and Hets to Solaris. If this succeeds, the server will be migrated to Solaris.

tillmo commented 10 years ago

I have just been told by an admin that he uses a hybrid Linux-Solaris solution, namely Illumos-KVM. VMs are encapsulated in minimalistic Solaris zones, with network virtualisation via "Crossbow". ZFS volumes are used as virtual hard disks.

jelmd commented 10 years ago

Yes, SmartOS is definitely very interesting and can be used. But as said, IMHO running a server in a VM (no matter which one) is often not a "close to optimal" solution: see http://dtrace.org/blogs/brendan/2013/01/11/virtualization-performance-zones-kvm-xen/

tillmo commented 10 years ago

Agreed, avoiding VMs brings better performance. However, since we decided above to start with Linux, this would mean a container solution under Linux. Do you think that LXC is a good solution? There are also OpenVZ and Linux-VServer. (I do not know any of these...)

jelmd commented 10 years ago

"LXC is a good solution?" No, IMHO not yet - from what I've read so far, I would describe it as a good start. E.g. [6]: "... All three systems showed poor performance isolation for memory, disk and network. ..." (And when I read about SELinux/AppArmor and all the other huge maintenance overhead that is required, my hair starts getting grayer immediately ;-) )

However, with Linux as the primary target, and according to the docs below, LXC is probably the best/most reliable thing one can get:

corny commented 10 years ago

I suggest using enterprise SSDs like the Samsung SM843T, which is available with 120GB, 240GB, 480GB and 960GB. Its price is around 1 € per GB.
Pricing: http://www.preissuchmaschine.de/preisvergleich/produkt.cgi?suche=samsung+sm843t
Review and benchmarks: http://www.storagereview.com/samsung_sm843_enterprise_ssd_review

DanielCoutoVale commented 10 years ago

Hi Julian,

last time I configured a server, an SSD had a worse linear write time and a worse maximum rewrite capability (40 times) per cell. So I chose the storage based on the kind of usage I would make of it, and not by default on the kind of technology, such as SSD or HDD.

Questions I would ask:

1) Do we expect the same point in memory to be overwritten several times? Do we have hardware specs about the number of rewrites each storage device is capable of?

2) Do we expect to write often and, if so, randomly or linearly? Do we have hardware specs about how fast the random and linear writings are?

3) Do we expect to read often and, if so, randomly or linearly? Do we have hardware specs about how fast the random and linear readings are?

Best, Daniel.

corny commented 10 years ago

@DanielCoutoVale Which SSD did you use in your configuration?

DanielCoutoVale commented 10 years ago

I didn’t use an SSD. I used an HD at the time (4,5 years ago) because it supported 80 rewrites and the comparable SSD supported 40 rewrites. I have not checked the evolution of specifications of the SSDs in the last 4,5 years, so my comment was not a suggestion of a particular SSD of a particular brand, but a suggestion of which parameters to consider in order to choose a storage.

corny commented 10 years ago

"The Samsung SM843T is optimized for sustained random read and write workloads up to 1,930 TBW (480GB) and up to 3,680 TBW with the 960GB capacity SSD which represents a 30x improvement of its predecessor."

The 960GB drive is specified to support 3,860 random terabytes written (RTW), which means about 4,020 full-drive rewrites, or roughly "2 drive writes per day for 5 years".

http://www.samsung.com/global/business/semiconductor/file/media/SM843T_Product_Overview-0.pdf
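
The endurance arithmetic can be checked directly. Using the 3,680 TBW figure quoted from the product overview (the comment's 3,860 figure gives nearly the same result):

```python
# Full-drive rewrites and drive-writes-per-day (DWPD) implied by a TBW rating.
def rewrites_and_dwpd(tbw: float, capacity_tb: float, years: int = 5):
    rewrites = tbw / capacity_tb     # full-capacity rewrites over the warranty
    dwpd = rewrites / (years * 365)  # drive writes per day
    return rewrites, dwpd

rewrites, dwpd = rewrites_and_dwpd(tbw=3680, capacity_tb=0.96)
print(f"~{rewrites:.0f} full rewrites, ~{dwpd:.1f} drive writes/day over 5 years")
```

This lands at roughly 3,800 full rewrites and a bit over 2 drive writes per day, matching the "2 drive writes per day for 5 years" claim.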

DanielCoutoVale commented 10 years ago

OK. Then the problem I was facing back then is completely outdated. :-)

jelmd commented 10 years ago

Hi Julian,

IMHO the SM843T is a poor choice, and the term "enterprise" seems to relate solely to the fact that it is overprovisioned. Many vendors just overprovision their consumer-grade SSDs and then call them "enterprise", because from this alone they get a higher endurance. Overprovisioning in this context usually means making only ~60-70% of the real capacity visible to the application (consumer grade ~90-95% IIRC), so the controller has enough space to phase out/remap bad cells and doesn't need to spend a lot of time finding a hole big enough to write some data in a row (easier wear leveling).

Also its write performance is pretty poor (which makes me think that they just took a consumer-grade SSD and put some RAM into it to accelerate reading, using some more or less good internal prefetching). 11 KIOPS is not that good, and this number is probably for optimized writes - SSDs tend to be organized in 4KB blocks. If an app writes using 8 KB (which many DBs do; some use even bigger blocks), it probably drops down to ~5 KIOPS. Compare this with modern RAM-backed real enterprise SSDs, like http://www.fusionio.com/data-sheets/iodrive2-duo/ or http://www.stec-inc.com/products/s1120-pcie-accelerator/ . On the Stec data sheet one can also read a very important little word: steady! Most non-RAM-backed SSDs just drop [dramatically] in performance after a more or less short time, and thus I think that after a while even 5K is a pretty optimistic value for the SM843T. In the tests you referred to one can see this pretty well (look at the Micron nightmare ;-) ). Another, related thing is latency - the thing everybody wishes would be close to zero ;-) - here one can see that from time to time SSDs tend to take a little breath, whereby for enterprise-grade SSDs "little" really means little. But look at the SM843T: it seems that this one takes a really deep/long breath from time to time (is it filling/updating its RAM cache again?) - this would drive you nuts when you try to find out why your app seems to stall from time to time or why it behaves so "randomly" ...
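
The 4 KB vs. 8 KB point amounts to a simple bandwidth argument: if the controller sustains a fixed write bandwidth, doubling the block size roughly halves the IOPS. This is a simplification (real drives also pay extra for read-modify-write on misaligned blocks), but it reproduces the ~5 KIOPS figure:

```python
# If sustained write bandwidth is the bottleneck, IOPS scale
# inversely with the block size.
def iops_at_block_size(base_iops: float, base_kb: float, new_kb: float) -> float:
    bandwidth_kb_s = base_iops * base_kb  # sustained write bandwidth in KB/s
    return bandwidth_kb_s / new_kb

print(iops_at_block_size(11000, 4, 8))  # 5500.0 IOPS at 8 KiB writes
```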

Finally, I haven't found anything about a supercapacitor or something similar, which would protect the SSD's RAM content by providing enough power to write everything back to the flash when the machine loses power. If one says "well, it doesn't happen very often, and if it does and crashes my DB/app data, that's not a big deal", then one may not miss such a feature. But for enterprise-grade applications, data reliability/consistency is a must, and that's why - together with all the stuff above - I would call the SM843T not enterprise but "just a pimped consumer-grade SSD". ;-)

corny commented 10 years ago

Hi jelmd, thanks a lot for your explanations. I am not so familiar with that enterprise stuff and the usage of SSDs in servers. I agree with you that we should prefer classic HDDs.

tillmo commented 10 years ago

Here, the claim that enterprise HDDs are better than consumer HDDs is questioned: http://www.admin-magazin.de/News/Tipps/ADMIN-Tipp-Lohnen-Enterprise-Platten?utm_source=ADMIN+Newsletter&utm_campaign=ADMIN-Newsletter-2013-49&utm_medium=email

jelmd commented 10 years ago

Ehhmm, I wrote about SSDs ;-)

Anyway, the HDD article just focuses on the AFR. It doesn't even mention the drives used. Since this company seems to be a backup company, "enterprise" in this context probably means NL aka NearLine storage drives, which have a SAS interface and much bigger capacity than usual enterprise (SAS/FC) HDDs, but mechanically are equivalent to SATA drives (e.g. they run at only 7 kRPM, and thus have higher latency, less throughput, and fewer - i.e. ~0.5x - IOPS), which makes them much cheaper (IIRC, at least for 1TB+ HDDs, SATA and NL HDD prices are almost the same). Also, the author says that their enterprise drives are only 2 years old. Our experience is that if a drive crashes or has performance problems, this usually occurs in the 1st year ("Monday production" ;-)). After the 1st year one may expect it to run another 8-10 years without any problems. So ...

Anyway, yes, desktop drives are much more reliable than 10 years ago and have gotten closer to the enterprise drives - depending on the model, even equal - however, I would take his results with a big grain of salt. IMHO he basically says: don't buy Learjets but Cessnas, because both fly, but Cessnas are much cheaper ;-)

jelmd commented 10 years ago

Oh I forgot, because he mentioned vibration: http://www.youtube.com/watch?v=tDacjrSCeq4 ;-)

corny commented 10 years ago

@tillmo The E5-2690v2 with 10 cores is out now: http://ark.intel.com/products/75279/Intel-Xeon-Processor-E5-2690-v2-25M-Cache-3_00-GHz

corny commented 10 years ago

@jelmd I found some articles about MySQL performance concerning SSDs. The benchmark in the first article shows that an Intel DC S3500 is more than twice as fast as a Samsung 840 on 16KiB random writes.

Seagate Savvio 10K.5 vs Intel DC S3500:

jelmd commented 10 years ago

Hi corny,

Have fun, jel.

tillmo commented 10 years ago

We have come to the following conclusion, considering that we won't use ZFS under Linux and therefore prefer SSDs over HDDs + expensive accelerators:

tillmo commented 10 years ago

@jelmd asks: how shall the server be partitioned? What OS (Ubuntu LTS server?)? Which file system (@jelmd suggests XFS)?

tillmo commented 10 years ago

The configuration is quite clear now. The only remaining question: Intel DC S3700 (not 3500!) or Seagate 600 pro?

corny commented 10 years ago

The Intel DC S3700 has the best performance and the cheapest price per TB written, but costs around 2 € per GB. The difference between the 600 Pro with 400GB (2700 rewrites) and 480GB (875 rewrites) is that the smaller one is more over-provisioned.

Here you can find a comparison: http://www.tomshardware.com/reviews/ssd-dc-s3500-review-6gbps,3529-4.html

I think we should buy 4-6 of the DC S3700 400GB for read/write-intensive applications (i.e. the database) and some Seagate Savvio 10K.5 for less IO-intensive applications.

@jelmd Which configuration would you suggest if we want to mix SSDs and HDDs? Do you think that we require such a high write endurance?
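
The "cheapest price per TB written" claim can be sketched with the rewrite counts given above. The prices here are rough assumptions for illustration (roughly 400 € for a 600 Pro and ~2 €/GB for the S3700), not vendor quotes; the S3700's endurance is derived from its commonly cited 10-drive-writes-per-day-for-5-years rating:

```python
# Cost per TB of write endurance. Rewrite counts are from the comment above;
# prices (EUR) are rough assumptions for illustration, not vendor quotes.
def euro_per_tbw(price_eur: float, capacity_tb: float, rewrites: float) -> float:
    return price_eur / (capacity_tb * rewrites)

drives = {
    "Seagate 600 Pro 400GB": euro_per_tbw(400, 0.40, 2700),   # assumed ~400 EUR
    "Seagate 600 Pro 480GB": euro_per_tbw(400, 0.48, 875),    # assumed ~400 EUR
    "Intel DC S3700 400GB":  euro_per_tbw(800, 0.40, 18250),  # ~2 EUR/GB; 10 DWPD * 5 years
}

for name, cost in sorted(drives.items(), key=lambda kv: kv[1]):
    print(f"{name}: {cost:.2f} EUR per TB written")
```

Under these assumptions the S3700 comes out roughly 3x cheaper per TB written than the 400GB 600 Pro, despite its higher up-front price per GB.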

corny commented 10 years ago

@jelmd1 I would like to have Ubuntu 13.10 on the server because it contains more recent packages and it is upgradable to 14.04 LTS.

@nning do you prefer a specific file system? I have good experience with EXT4 and XFS; jelmd prefers XFS.

tillmo commented 10 years ago

I have a question regarding the new configuration at http://iws.cs.uni-magdeburg.de/~elkner/supermicro/server3.html The HDDs cost about 40 cents per GB. Couldn't we take much cheaper consumer HDDs (available for about 6 cents per GB, for 2.5'' HDDs), noting that these shall be used for temporary proof files only? Even if these HDDs fail more quickly, we could buy 6 times as much capacity for the same price, so we could buy some reserve capacity to balance out the shorter lifetime.
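
The cost argument can be made concrete with the per-GB prices from the question:

```python
# Per-GB prices as stated in the question above.
enterprise_eur_per_gb = 0.40
consumer_eur_per_gb = 0.06

# For a fixed budget, capacity scales inversely with price per GB.
capacity_ratio = enterprise_eur_per_gb / consumer_eur_per_gb
print(f"Same budget buys ~{capacity_ratio:.1f}x the consumer-HDD capacity")
```

So even if the consumer drives had, say, triple the failure rate, the budget would still cover full replacements twice over - the open question is whether their firmware and performance are acceptable, which is what the reply below addresses.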

jelmd commented 10 years ago

@corny Yes, I prefer the S3700 as well, because its performance is OK and much more predictable/steady (+-5% wrt. the average) than the Seagate part (+-75% wrt. the average). Just have a look at the "Performance Consistency" charts at the THW URL you gave, or http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 or the spec http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ssd-dc-s3700-spec.pdf.

2 HotSpares for the main "pool" is a must on servers IMHO...

Wrt. endurance: well, some companies demand rock-solid SSDs, so Intel decided to do it right with the higher-priced model ;-). If you look at the S3500, it would allow only 200..300 GB/day of writes over 5 years. Anyway, I hope that the other chips (not only the MLCs) last that long ;-)

Wrt. HDDs - tillmo said that these will basically be used as a scratch area for dumping temporary stuff, so I'm a little bit emotionless about it. They just should be SAS, but whether it is HGST or Seagate 10K.5 or ... *.K7 isn't that important. More recent series are usually slightly faster and have some FW bugs fixed (and have encryption, which nobody needs), but that's it. If perf is not sufficient, we still have 8 spare slots in the current config ... ;-)

corny commented 10 years ago

Thanks a lot for your knowledge and recommendation. Then we should start with at least 4x Intel DC S3500 480GB and 4x HGST Ultrastar C10K900 with 900GB.

HGST Ultrastar C10K900 with 900GB is around 300 €. Seagate Savvio 10K.5 with 900GB is around 450 €.

tillmo commented 10 years ago

we have just decided to double the RAM to 256 GB

jelmd commented 10 years ago

@tillmo Summary of the lost answer wrt. cheap HDDs: WD Green and Barracuda LP HDDs are an absolute no-go! I don't feel comfortable with 5kRPM drives. Cheap (desktop-targeted) HDDs often have FW issues/bugs, like reporting 512B blocks but actually using 4KiB blocks internally -> terrible performance.

As an alternative one could use NearLine storage like the Seagate ST91000640SS, which is SATA quality wrt. the hardware and spins at 7kRPM instead of 10k, but has a SAS interface and SAS FW (~20 ¢/GB).

tillmo commented 10 years ago

Good, but I think we can also afford the HGST Ultrastar C10K900, since 10kRPM means better quality. I just wanted to ensure that the higher price is really justified by better quality.

nning commented 10 years ago

Just out of curiosity: From which company are you going to order the parts, @jelmd?

jelmd commented 10 years ago

Depends. If we ordered the complete machine as described, we get it from our local vendor in MD http://www.wbs-it.de/ or http://www.zstor.de/. If we assemble it by ourselves, we usually ask WBS, zstore, http://xitra.de, http://ctt.de and http://jacob-computer.de/ for the single parts. As usual, best offer wins.

jelmd commented 10 years ago

The pre-order phase is starting. Any final/last thoughts about the HW?

Wrt. CPU the choices are: http://ark.intel.com/products/family/78582/Intel-Xeon-Processor-E5-v2-Family#@Server and especially
http://ark.intel.com/products/75279/Intel-Xeon-Processor-E5-2690-v2-25M-Cache-3_00-GHz
http://ark.intel.com/products/75273/Intel-Xeon-Processor-E5-2667-v2-25M-Cache-3_30-GHz

Since most applications today don't make much use of multi-threading, are somewhat monolithic, and thus benefit more from higher CPU clocks than from more cores/threads aka strands, my gut says the E5-2667 v2 might be better, but I don't know the ontohub environment/working set, so any hints are welcome ...
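
The trade-off between the two SKUs (per the ark.intel.com pages above: E5-2690 v2 = 10 cores at 3.0 GHz base, E5-2667 v2 = 8 cores at 3.3 GHz base) is essentially per-core speed vs. aggregate throughput:

```python
# (cores, base clock in GHz) for the two candidate Xeons, from ark.intel.com.
cpus = {
    "E5-2690 v2": (10, 3.0),
    "E5-2667 v2": (8, 3.3),
}

for name, (cores, ghz) in cpus.items():
    print(f"{name}: {ghz} GHz per core, {cores * ghz:.1f} GHz aggregate")
```

So a single-threaded workload (one Hets process) runs ~10% faster on the E5-2667 v2, while a fully parallel workload gets ~14% more aggregate cycles on the E5-2690 v2 - which is why the workload profile decides this question.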

tillmo commented 10 years ago

OK, in the meeting we have agreed to take the E5-2667-v2.

jelmd commented 10 years ago

FYI: Order is going out next week.

tillmo commented 10 years ago

the server has arrived! I have seen the box today.

jelmd commented 10 years ago

We are going to put the thing into the rack for testing/install next week. Since CentOS already comes with Docker support, and it is - besides SuSE and RH - the only Linux distro supported by our archaic Uni-RZ backup software (NetBackup 7.6 =8-( ), I prefer to give CentOS a try first. Any objections?

SGraef commented 10 years ago

Which system do you plan to use for virtualization? We want to use Ubuntu for the ontohub instances.

jelmd commented 10 years ago

Well, the big picture is: Docker is a framework which utilizes LXC to build containers, which are a kind of chroot environment (yes, this is an extremely simplified description), and thus constrained to the running OS (i.e. no virtualization wrt. the OS). So since Ubuntu is Linux, those binaries should work. However, since every Linux vendor applies its own patch sets, one may run into trouble or encounter curious behavior sooner or later -> my gut says "not a good idea". Unfortunately I haven't had a deep dive into the Docker/LXC stuff yet, so I can't tell for sure. In this context, I wonder whether there is anything special about Ubuntu that makes you insist on it? I mean, it is a server. There is no GUI/desktop environment (X11 & friends) running on it ...

tillmo commented 10 years ago

Hi @jelmd, we have discussed this issue today. Our programmers would prefer a Debian-based solution. However, if you are willing to set up the CentOS environment and also deploy ontohub to CentOS (using the soon-to-be-finished wiki page https://github.com/ontohub/ontohub/wiki/Deployment), then we want to give it a try. If this should not work out for some reason, we can fall back to CentOS hosting VMs with Ubuntu as the guest OS. P.S. Has the missing parcel with the hardware part been found?