tomeichlersmith / umn-cluster

Notes and code for administering the UMN SPA Computing Cluster
https://tomeichlersmith.github.io/umn-cluster/

Open Science Grid #4

Closed tomeichlersmith closed 2 years ago

tomeichlersmith commented 2 years ago

Reach out to Bryan Lim at UW to talk about Condor and OSG.

tomeichlersmith commented 2 years ago

Meeting Notes

May 16, 2022

Attendees:

Chad Intro

Focus on a simple solution that is supportable in the long term.

What can CSE-IT provide?

Get additional storage from CSE-IT like home directories?

Usable size of Hadoop?

Disposable compute nodes?

As hardware dies we can get rid of it.

By core count, the cluster is comparable to European Tier 2 sites. Storage capacity is the biggest issue - Hadoop is dying without new hardware.

First Order

Workflow Control

Hardware Recommendation

Upgraded head-node would be very helpful.

Storage Element Only

Allows remote jobs to write to our area.

Needs

Questions

Chad's Reading of Docs

The first diagram on this page is a good top-level view. https://opensciencegrid.org/docs/compute-element/hosted-ce/

I'm looking over the Hosted CE requirements. Please review these:

Other pieces

OSG Repository available to all nodes - Easy to set with Puppet

EPEL Repository for Singularity - Already set in Puppet

CVMFS - Already set in Puppet

It looks like the worker node containers are not to configure the worker nodes but to send jobs to the queue as a container.
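A rough sketch of how the three worker-node pieces above (OSG repository, EPEL for Singularity, CVMFS) could be spot-checked on a node after Puppet runs. The repo file names `osg.repo` and `epel.repo`, the CVMFS repository `oasis.opensciencegrid.org`, and the function `check_worker_node` are assumptions for illustration; none of them come from the cluster's actual Puppet code.

```python
import os
import shutil

# Assumed defaults for illustration only; adjust to match the real Puppet setup.
DEFAULT_REPO_FILES = ("osg.repo", "epel.repo")
DEFAULT_CVMFS_REPO = "oasis.opensciencegrid.org"


def check_worker_node(root="/", repo_files=DEFAULT_REPO_FILES,
                      cvmfs_repo=DEFAULT_CVMFS_REPO, which=shutil.which):
    """Return a dict mapping each prerequisite name to True or False."""
    results = {}
    # Yum repository definitions are expected under /etc/yum.repos.d.
    for name in repo_files:
        path = os.path.join(root, "etc", "yum.repos.d", name)
        results[name] = os.path.isfile(path)
    # Listing the directory lets autofs mount the repository if it is configured.
    cvmfs_path = os.path.join(root, "cvmfs", cvmfs_repo)
    try:
        results["cvmfs"] = len(os.listdir(cvmfs_path)) > 0
    except OSError:
        results["cvmfs"] = False
    # Singularity should be on PATH once it is installed from EPEL.
    results["singularity"] = which("singularity") is not None
    return results


if __name__ == "__main__":
    for item, ok in check_worker_node().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

The `root` and `which` parameters exist only so the check can be exercised against a scratch directory instead of a live node.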

tomeichlersmith commented 2 years ago

traydock-osg module in Puppet