-
Given https://github.com/fairinternal/metaseq-internal/pull/181 , it seems like `arch` is not necessarily present in `args` in the training workflow when loading from disk. This seems like yet another ca…
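A minimal sketch of the kind of defensive lookup this implies, assuming nothing beyond the standard library; `DEFAULT_ARCH` and `resolve_arch` are hypothetical names, not part of metaseq:

```python
from argparse import Namespace

# Hypothetical default; checkpoints written by older code may not carry `arch`.
DEFAULT_ARCH = "transformer_lm"

def resolve_arch(args: Namespace) -> str:
    """Return args.arch if the saved args carry it, else fall back to a default."""
    return getattr(args, "arch", DEFAULT_ARCH)

# An old checkpoint's args without `arch` no longer crash the training workflow:
old_args = Namespace(lr=0.001)          # no `arch` attribute saved
new_args = Namespace(arch="opt_125m")

print(resolve_arch(old_args))
print(resolve_arch(new_args))
```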
-
## 🚀 Feature Request
Add documentation for converting fine-tuned OPT models to HuggingFace
### Motivation
HuggingFace [added OPT to their suite which is a great win for the community!](https://…
-
There are currently two logbooks checked into the OPT project directory (https://github.com/facebookresearch/metaseq/tree/main/projects/OPT/chronicles), which were placed there for easy access on releas…
-
Hi @kokitsuyuzaki,
Thanks for making this nice R package.
Recently, I have been trying to understand the different steps of the method used in the metaSeq package, as I want to use this tool for an ana…
-
This is to look into whether we still need apex for speedups, or whether out-of-the-box PyTorch 2.0 "just works". Will require benchmarking at a few different scales to confirm.
-
This is to look into whether we can remove our Megatron dependency and rely entirely on our Fairscale dependency (the model parallelism implementations appear to be identical between the two).
-
Hi,
Regarding the models listed at https://github.com/pytorch/fairseq/tree/main/examples/moe_lm : the model card and NOTE file included with the models say "Models are intended for research use onl…
-
Right now, if we were to launch a job with `--restore-file` **and** experience a job restart shortly thereafter (before a new checkpoint is written), we fail to resume again from the `--restore-file` and …
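A sketch of the resolution order the fix presumably needs, in plain Python; the function name, `checkpoint_last.pt` convention, and priority order are assumptions for illustration, not metaseq's actual implementation:

```python
import os
from typing import Optional

def resolve_restore_path(save_dir: str, restore_file: Optional[str]) -> Optional[str]:
    """Pick the checkpoint to resume from:
    1. the newest checkpoint this run already wrote (checkpoint_last.pt in save_dir),
    2. otherwise the user-supplied --restore-file, if it exists,
    3. otherwise start from scratch (None).
    """
    last = os.path.join(save_dir, "checkpoint_last.pt")
    if os.path.exists(last):
        return last
    if restore_file and os.path.exists(restore_file):
        return restore_file
    return None
```

With this ordering, a restart before any new checkpoint is written still falls back to `--restore-file` instead of failing to resume.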
-
Most likely because we loop over a bunch of tensors (gradient / activation norms, etc.) and move them to the CPU for logging.
Weirdly, this happens outside of the WPS and UPS counters, so we were not noticing…
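A minimal sketch of the pattern and its cheaper alternative, assuming PyTorch; the function names are hypothetical. Calling `.item()` per tensor forces one device-to-host sync each time, whereas stacking the norms on device and transferring once syncs only at the end:

```python
import torch

def per_tensor_norms_slow(tensors):
    # One device->host sync per tensor: each .item() blocks until the GPU catches up.
    return [t.norm().item() for t in tensors]

def per_tensor_norms_batched(tensors):
    # Compute every norm on device, then do a single transfer at the end.
    norms = torch.stack([t.norm() for t in tensors])
    return norms.cpu().tolist()
```

Both return the same values; the batched version just amortizes the synchronization cost across all logged tensors.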
-
## 🚀 Feature Request
Convert OPT checkpoints to Megatron-LM or FasterTransformer
### Motivation
I am currently trying to use OPT in a production environment.
However, because the 175B model is …