-
Because Clowder allows multiple datasets to have the same name, we've frequently run into a problem with duplicate datasets during the Globus upload process. This results in downstream problems, such …
-
**Describe the feature you'd like**
Need the ability to use `@remote` to train on a multi-instance node for distributed training.
**How would this feature be used? Please describe.**
Distributed t…
-
**Describe the feature:**
Add the ability to join two or more data tables or saved searches for visualization purposes.
This feature won't affect the queries being run but will add another layer to …
-
**Describe the bug**
The sequence length during training differs from the one specified. In the configs, I specified seq-len 50016, which is divisible by the tensor-model-parallel-size=4; however, du…
-
Thank you for your great work! I ran into this problem when training the model on the hmdb_51 dataset:
```
[2023-09-17 14:18:47 ViT-B/16](main.py 181): INFO Train: [0/50][0/3383] eta 0:49:54 lr 0.00000000…
-
Hi, in the paper you mentioned that you trained on 3 datasets:
- VGGFace2: ~3,150,000 images
- BUPT-Balancedface: ~1,250,000 images
- VoxCeleb2: 145k videos * 25 frames per second * ~6 seconds…
-
Recently, I used `open_mfdataset` to open a local tar.gz file containing multiple netCDF files.
It failed to open the file and raised a `distributed.scheduler.KilledWorker: Error` and
`TypeError: cannot serialize…
-
**Describe the bug**
I tried to fine-tune the `llama3-8B` model on multiple nodes but got an AttributeError after the mcore-format checkpoint finished loading and dataset building started. The error is below:
…
-
Hi,
I am trying to run inference with `llama2+13b` on four RTX 3090 GPUs, each with 24 GB of memory. However, I noticed that the sample inference code uses only one GPU, which causes out of …
-
I ran the following command (from the [README](https://github.com/microsoft/scene_graph_benchmark#vinvl-feature-extraction)):
```
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl…