Closed kathy-phet closed 1 year ago
Here is the quote for the recent phet-server2 replacement. Do we want to go with something like this?
Here is the machine on dell.com, where we can see pricing for various memory/cpu/storage options: https://www.dell.com/en-us/shop/cty/pdp/spd/poweredge-r440/pe_r440_tm_vi_vp_sb. The pricing will be slightly different as we'll purchase it through CU Marketplace, which I believe has some discounts but I'm not aware of the details on that. Maybe @oliver-phet could help.
Should we schedule a meeting to discuss internally? Or should I reach out to Jason to schedule a meeting with him to discuss (with @zepumph and optionally @jonathanolson)?
I'm curious about what memory is right. 8GB felt a bit low, but I'm not really sure:
Or is that 8GB per core?
I also didn't really understand the "best practices" notes and if we were doing things correctly.
That's 8GB per memory stick in the screen shot, you can hit the + and add more sticks. On phet-server2 we have 8x8GB configured.
Bayes stats:
Memory: 256GB (looks like 8x32GB)
CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz -- 40 threads (looks like 2 cpus x 10 cores/cpu x 2 threads/core)
Storage: Looks like we need at least 11TB plus room to grow? vgs reports how much is physically available; df -h reports how much is allocated and used.
[mape5853@bayes ~]$ sudo vgs
VG #PV #LV #SN Attr VSize VFree
os 1 5 0 wz--n- 10.91t 3.38t
[mape5853@bayes ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 126G 0 126G 0% /dev
tmpfs 126G 14G 112G 12% /dev/shm
tmpfs 126G 131M 126G 1% /run
tmpfs 126G 0 126G 0% /sys/fs/cgroup
/dev/mapper/os-root 10G 6.5G 3.6G 65% /
/dev/sda2 477M 164M 285M 37% /boot
/dev/mapper/os-home 15G 11G 4.7G 69% /home
/dev/mapper/os-var 6.0G 3.7G 2.4G 62% /var
/dev/mapper/os-data 7.5T 4.9T 2.6T 66% /data
tmpfs 26G 0 26G 0% /run/user/17931
tmpfs 26G 0 26G 0% /run/user/380503
tmpfs 26G 0 26G 0% /run/user/454144
tmpfs 26G 0 26G 0% /run/user/584799
tmpfs 26G 0 26G 0% /run/user/451065
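As a rough cross-check of the storage figure, the numbers above can be combined in a few lines. This is only a sketch: the values are copied by hand from the vgs and df -h output, the variable names are illustrative, and it assumes both tools report in the same units.

```python
# Values copied by hand from the vgs and df -h output above.
# Assumption: vgs "t" and df -h "T" are the same unit.
vg_size_t = 10.91    # VSize of volume group "os"
vg_free_t = 3.38     # VFree: space not yet allocated to any logical volume
data_size_t = 7.5    # Size of /dev/mapper/os-data
data_used_t = 4.9    # Used on /data

# Space already handed out to logical volumes (root, home, var, data, ...).
allocated_t = vg_size_t - vg_free_t

# Free space inside the /data filesystem itself.
data_free_t = data_size_t - data_used_t

# Total headroom if the unallocated VG space were all given to /data.
headroom_t = vg_free_t + data_free_t

print(f"allocated to logical volumes: {allocated_t:.2f}T")
print(f"free inside /data:            {data_free_t:.2f}T")
print(f"total potential headroom:     {headroom_t:.2f}T")
```

Under these assumptions, roughly 7.5T of the 10.91T is allocated and about 4.9T of /data is in use, so matching the current 11TB leaves room to grow without buying more.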
This should be a good starting point for a conversation with Jason.
Using an R440 poweredge, to just match these specs would cost $25,389.82.
@jonathanolson @zepumph - do we know what our limiting factors are for CT? What resources should we prioritize increasing?
According to https://bayes.colorado.edu/xymon/ (creds are in the doc), bayes.colorado.edu is almost always reported as maxed out on CPU. However, that report isn't always measuring things correctly, so we should use it as a guide for investigation and not as a final answer.
Thanks, Matt, for getting this issue rolling. Maybe one 30-minute internal meeting with you, JO, and MK, and then a meeting with Jason.
Also @oliver-phet - Do we have the original bayes machine quote from CU OIT? Can you attach it here?
Is this what you were asking for? The 2015 machine specs? Dell912687874.pdf
We have set up a meeting for this afternoon and will report back.
Configuration that we agreed upon in a meeting with @jonathanolson @oliver-phet @kathy-phet @zepumph and @mattpen
@oliver-phet pointed out that the "max" memory slots per CPU for this processor is 6:
I sent an email to Jason requesting a meeting time.
Maybe there isn't anything we can do about this, but the processor we have selected is "2nd Gen", and Intel launched its "3rd Gen" Xeon Scalable processors in Q2'21. https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable/gold/products.html
I imagine we'd have to select a different motherboard or housing. I think Jason could help us with this if we're interested in using 3rd gen instead of 2nd.
Looks like the 6230 (what we were looking at, with 40T per CPU) was released in Q2'19.
A different PowerEdge has those newer options: https://www.dell.com/en-us/shop/servers-storage-and-networking/poweredge-r550-rack-server/spd/poweredge-r550/pe_r550_tm_vi_vp_sb
It offers this one: Intel® Xeon® Gold 5318Y 2.1G, 24C/48T, 11.2GT/s, 36M Cache, Turbo, HT (165W) DDR4-2933
This one has Platinum processor options with a crazy number of threads: https://www.dell.com/en-us/shop/servers-storage-and-networking/poweredge-r650-rack-server/spd/poweredge-r650/pe_r650_14796_vi_vp
I got pretty far configuring the R650, but realized it doesn't have an option for HDDs (only SSDs), so that added ~$5000 to the final price:
The R650 is showing SAS HDD options for me?
A bonus of the R650 is that it's in stock instead of out of stock. https://www.dell.com/en-us/shop/servers-storage-and-networking/poweredge-r650-rack-server/spd/poweredge-r650/pe_r650_14796_vi_vp?configurationid=fd2911a3-0ef7-4116-ae2b-1f08c6b9bb1b It comes with Platinum 64T processors ... but maybe not everything we need.
Clicking around a bit more... I think the R440 chassis was limiting our CPU choices. This R450 dual CPU rack allows 3rd gen processors: https://www.dell.com/en-us/shop/servers-storage-and-networking/poweredge-r450-rack-server/spd/poweredge-r450/pe_r450_15127_vi_vp?view=configurations&configurationid=782abc84-b21c-4b11-baf8-39b7219aa6c2
Questions I have for our next meeting:
I recently had a good experience using GitHub Codespaces while investigating https://github.com/phetsims/chipper/issues/1353 and it seemed like it might also work for CT since you can clone many repos, install things and run programs, including web servers. Running an unbuilt sim over port forwarding didn't work, but it's unclear whether that would affect CT self-loading (no port forwarding). Of course it would be possible to run into other incompatibility problems. We previously (a few years ago) determined that cloud hosting would be too expensive (AWS, I believe), but now that Codespaces came out I wanted to double check that quote.
The pricing is listed at https://docs.github.com/en/billing/managing-billing-for-github-codespaces/about-billing-for-github-codespaces, which shows that 1 hour on a 32-core machine is $2.88. A month has 730 hours, so that works out to about $2100/month, which sounds pretty expensive. To match the phet-server2 quote above, the break-even point would be at $12,125.34/$2100, which is only about 5.8 months. I don't think we would want to do something like this without a break-even point of 4+ years, and to get to that price we would have to sacrifice cores or not run it 24/7. Storage is quoted at $0.07 per GB per month, and I did not count it in the calculation. Likewise, I did not check AWS to see how their prices compare today.
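For anyone revisiting this later, the arithmetic can be sketched as below. It uses only the figures quoted in this comment (hourly rate, hours per month, server quote). The 4-year horizon is the threshold mentioned above, the constant names are illustrative, and storage is left out just as it was in the original estimate.

```python
# Figures quoted in this comment; storage cost is intentionally excluded.
HOURLY_RATE_32_CORE = 2.88   # USD per hour for a 32-core Codespaces machine
HOURS_PER_MONTH = 730        # average hours in a month
SERVER_QUOTE = 12125.34      # USD, phet-server2 quote used for comparison
TARGET_MONTHS = 4 * 12       # desired break-even horizon (4 years)

# Cost of running the cloud machine 24/7 for one month.
monthly_cost = HOURLY_RATE_32_CORE * HOURS_PER_MONTH

# Months of 24/7 cloud usage that cost as much as buying the server.
break_even_months = SERVER_QUOTE / monthly_cost

# Monthly spend that would stretch the break-even point out to 4 years,
# and how many hours of the 32-core machine that spend would buy.
monthly_budget = SERVER_QUOTE / TARGET_MONTHS
hours_within_budget = monthly_budget / HOURLY_RATE_32_CORE

print(f"24/7 monthly cost:        ${monthly_cost:,.2f}")
print(f"break-even:               {break_even_months:.2f} months")
print(f"budget for 4-yr breakeven: ${monthly_budget:,.2f}/month")
print(f"hours/month at that budget: {hours_within_budget:.1f}")
```

On these numbers, a 4-year break-even would allow only about 88 hours per month on the 32-core machine, which is why it would mean giving up cores or not running 24/7.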
Anyways, just wanted to jot down a paper trail in case someone (like future me) asks about cloud computing.
---------- Forwarded message ---------
From: Jason Edward Hill jason@colorado.edu
Date: Tue, Nov 8, 2022 at 1:51 PM
Subject: Re: Purchase advice on a new PhET server for a testing machine (re BAYES recent failure)
To: Kathy Perkins katherine.perkins@colorado.edu, Matthew Pennington Matthew.Pennington@colorado.edu
Cc: Jonathan B Olson jonathan.olson@colorado.edu, Michael J Kauzmann Michael.Kauzmann@colorado.edu, Oliver Pascal Nix Oliver.Nix@colorado.edu
Hi all,
Again, my apologies for the delay on this. I've attached a quote. It includes all of what you specified, and I added the 2-port SFP28 NIC, got rid of the power cords, added a ready-rails mount without cable management.
Please look it over and let me know what you think.
Cheers, Jason
@kathy-phet would like us all to review Jason's comments in his latest email and the quote he provided (included above), and either comment with changes that should be made or approve the purchase.
I went through the quote process and generated an identical server with the same price, looks good to me.
I sent Jason an email asking if we should run this by the Dell Rep.
The server has been purchased and acquired. Next steps are in https://github.com/phetsims/special-ops/issues/234.
The goal for this is to have something ordered by July.
It isn't clear to me what the specs are on this. Let's discuss this further!