-
**Some important tangents to the distributed data question are:**
- Enabling more kinds of ObjectStores, which is potentially useful for distributed data, especially in a multi-user Galaxy context.
- Impl…
-
This issue is more philosophical and discussion-oriented than actionable (I think).
People want to use "data" on mybinder.org (or any Binder deployment) and currently we offer no particular integrati…
-
**Problem**
Whenever we reuse intermediate results and there is a pipeline breaker (such as shuffles, joins, reductions, or groupby operations), it forces us to materialize the entire intermediate …
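To make the term "pipeline breaker" concrete, here is a minimal sketch in plain Python. It is not this project's API: `read_rows`, `double`, and `group_values` are hypothetical names, and the rows are made-up toy data. The point is only that a row-at-a-time operator can emit output after reading a single input row, while a groupby must consume (and here, buffer) its whole input first.

```python
def read_rows(log):
    """Hypothetical source: yields rows one at a time and records each read."""
    rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
    for row in rows:
        log.append("read")
        yield row


def double(rows):
    """Pipelined operator: handles one row at a time, no buffering."""
    for key, value in rows:
        yield key, value * 2


def group_values(rows):
    """Pipeline breaker: buffers every input row before returning anything."""
    groups = {}
    for key, value in rows:  # drains the entire upstream
        groups.setdefault(key, []).append(value)
    return groups


# Streaming: asking for one output row triggers exactly one read.
stream_log = []
stream = double(read_rows(stream_log))
first = next(stream)
reads_after_first = len(stream_log)

# Breaker: the grouped result only exists after all rows were read and held.
group_log = []
groups = group_values(double(read_rows(group_log)))
reads_for_group = len(group_log)

print(first, reads_after_first)
print(groups, reads_for_group)
```

In this toy version the buffered state is just a dictionary in memory; in a distributed engine the same constraint is what forces the intermediate result to be materialized before the next stage can start.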
-
Hi, I am trying to fetch FLAN v2 by running:
`PYTHONPATH=. python flan/v2/run_example.py`
I could successfully run `cot_submix`, but I hit an **out-of-memory issue** when I was trying to fetch…
-
Found cached dataset json (/home/ub2004/.cache/huggingface/datasets/json/default-6eef2a44d8479e8f/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)
100%|████████████████████████…
-
### Feature request
Is it possible to download only the data that I am requesting and not the entire dataset? I run out of disk space, as it seems to download the entire dataset instead of only the pa…
-
### Version
1
### DataCap Applicant
yann-y
### Project ID
1
### Data Owner Name
noaa
### Data Owner Country/Region
American Samoa
### Data Owner Industry
Environment
### Website
https://w…
-
### 🐛 Describe the bug
torchrun --standalone --nproc_per_node=4 ./examples/train_sft.py
--pretrain /home/**/**/text-generation-webui/models/LLaMA-7B
--model 'llama'
--strategy colossala…
-
The error is:
Traceback (most recent call last):
File "/home/yu/.jupyter/ngsv/2dtan/train_net.py", line 161, in <module>
main()
File "/home/yu/.jupyter/ngsv/2dtan/train_net.py", line 155, in ma…
-
In the 20230830 release, there is a mismatch in the number of cells between the expression matrix and metadata for the Allen MERFISH data. Metadata has 3938808 cells, and the expression matrix has 433…
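One way to narrow down a mismatch like this is to compare the cell identifiers on both sides rather than only the counts. The sketch below is a generic illustration under stated assumptions: `compare_cell_ids` is a hypothetical helper, the IDs are made-up toy values rather than anything from the release, and it assumes both the metadata and the expression matrix expose a comparable cell label.

```python
def compare_cell_ids(metadata_ids, matrix_ids):
    """Report which cell IDs are shared and which appear on one side only."""
    metadata_ids = set(metadata_ids)
    matrix_ids = set(matrix_ids)
    return {
        "shared": len(metadata_ids & matrix_ids),
        "only_in_metadata": sorted(metadata_ids - matrix_ids),
        "only_in_matrix": sorted(matrix_ids - metadata_ids),
    }


# Toy IDs (hypothetical); real labels would come from the two files.
metadata_cells = ["cell_1", "cell_2", "cell_3", "cell_4"]
matrix_cells = ["cell_1", "cell_2", "cell_3", "cell_4", "cell_5"]

report = compare_cell_ids(metadata_cells, matrix_cells)
print(report)
```

If one side is a strict superset of the other, the extra IDs show whether the difference looks like filtering (for example, cells dropped from the metadata) or like a genuine inconsistency between the files.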