-
Hey maestro maintainers! :wave:
Heads up that we wrote a script adapter (according to maestro's design) for Kubernetes. It will create a batch job and an associated config map with an entrypoint to …
vsoch updated
8 months ago
-
I'm looking at using RAJA within Dyninst as a quick way to test and migrate among various forms of parallelism (Cilk, OpenMP, and TBB are the three of interest right now, as well as retaining serial capab…
-
### What happened?
The program crashed while using PySR, with an error message indicating a memory access violation (EXCEPTION_ACCESS_VIOLATION). This error occurred during the garbage collection pro…
-
## Issue description
Unable to use MPI rendezvous in Caffe2.
I understand that this information may not be sufficient to help me out. Hence, please ask me to perform whatever steps that…
-
Right now I think I'm required to provide a Flux job ID for a job dependency. I'd ideally like to be able to do this:
```bash
flux submit --job-name task-1 -N 1 bash -c echo Starting task 1; sleep 3; …
```
vsoch updated
5 months ago
-
Hi! I'm looking at the tutorial here: https://github.com/CrossFacilityWorkflows/DOE-HPC-workflow-training/tree/main/Balsam and trying to imagine how this works with a job manager like Flux Framework. …
-
## Bug Report
@csiefer2 @nchaimov
### Description
Building `trilinos@master +testing +rocm amdgpu_target=gfx90a +amesos +amesos2 +anasazi +aztec +belos +boost +epetra +epetraext +ifpack +ifpack2…
-
1. I want to do incremental pre-training on the existing RoBERTa. Which RoBERTa model should I use? Should I download it directly from Hugging Face? After downloading it, do I need to convert it into UER format with a script? Is…
-
We are interested in training NequIP potentials on large datasets of several million structures.
Consequently, we wanted to know whether multi-GPU support exists or if someone knows whether the networ…
-
### Bug description
When trying to train on two GPUs in a Jupyter notebook environment on jarvislabs.ai with `ddp_notebooks`, I get the following error: "MisconfigurationException: No supported gpu ba…