-
This issue concerns the dataset description at https://github.com/awslabs/open-data-registry/blob/main/datasets/software-heritage.yaml
Due to the recent rise in the quest for data for LLM training, we…
-
### Is your feature request related to a problem? Please describe.
I haven't yet found a good way to open large (exceeds RAM) remote (not on my local file system) GRIB files in xarray.
### Describe …
-
This stems from https://github.com/physiopy/physiopy.github.io/issues/11#issuecomment-753409815. I was thinking we can use this issue to build a list of datasets with physio data.
One good start is…
tsalo updated 3 years ago
-
Hi,
I was trying to create a custom tokenizer for a language that is not covered by the Llama 3.2 tokenizer.
I could not find exactly which tokenizer from HF I can use that is an exact altern…
-
Finding good example data for use in teaching is challenging for other data-intensive domains as well as (bio)image processing. The Carpentries and the Academic Data Science Alliance are collaborating…
-
We were discussing creating a new label/category/form for "Open Data", starting by adding https://github.com/vi-enne/codici_utili/tree/main/profughiUcraina
We need to work out how to start formalizing…
-
Let's build up a list of data sets that might be interesting.
* [UK Biobank](http://biorxiv.org/content/early/2017/04/24/130385)
-
Hi, I'm currently learning ML in my spare time, and I would like this repo to include links to open datasets about our beautiful country.
Regards,
Hector F.
-
Hi team,
Ahead of the announcement of the dataset, we want to check that everything works as expected for users. **Please follow the procedure below for the datasets you contributed yourself**, an…
-
Problem Description:
Developing a new machine learning model for improving the accuracy and performance of stock price predictions. The challenge lies in modeling the highly volatile nature of stoc…