ThomasSURF closed this issue 4 months ago
Thanks for using our repo to benchmark your models. From your table, it looks like the averages of your LLaMA-2 7b chat and Mistral 7B Instruct v0.2 results are 86.2% and 93.3% respectively. I think this is comparable to our reported results.
Regarding the pip freeze: did you run into any issues when using our provided container?
Ah you are completely right, it's only 13 tasks instead of 14 so indeed the results are correct. My bad 😄
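The two sets of averages agree once the divisor is corrected. A minimal sketch of the arithmetic, assuming the 4k averages quoted in the original post (80.02% and 86.62%) came from dividing the summed task scores by 14 instead of 13; the function name is only illustrative:

```python
# Rescale an average that was computed with the wrong number of tasks.
# Assumption: the per-task scores were summed correctly and only the divisor was off.

def rescale_average(avg: float, wrong_n: int, right_n: int) -> float:
    """Recover the summed score from the wrong average, then divide by the right count."""
    return avg * wrong_n / right_n

llama2_7b_chat = rescale_average(80.02, wrong_n=14, right_n=13)
mistral_7b_instruct = rescale_average(86.62, wrong_n=14, right_n=13)

print(f"{llama2_7b_chat:.1f}")       # 86.2
print(f"{mistral_7b_instruct:.1f}")  # 93.3
```

Both values match the 86.2% and 93.3% mentioned in the reply above.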
The Docker build cannot be run directly on HPC clusters, as they only allow fakeroot container runtimes such as Singularity/Apptainer. So I can only recreate the environment in a local Python virtual environment or by partially converting the Docker build file to Apptainer.
There, on both A100 and H100 with CUDA 12.3, I encountered the following error:
undefined symbol: _ZN2at4_ops9_pad_enum4callERKNS_6TensorEN3c108ArrayRefINS5_6SymIntEEElNS5_8optionalIdEE
https://github.com/Dao-AILab/flash-attention/issues/836
In the end, flash-attn==2.5.7 is the newest version which still works given the torch versions necessary for vllm and without the TransformerEngine library.
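An undefined symbol error like the one above usually means the compiled flash-attn extension was built against a different torch version than the one installed. A minimal sketch for checking which versions are present and whether each module imports cleanly; `check_extension` is a hypothetical helper, and it assumes that importing the top-level module also loads the compiled extension:

```python
import importlib
from importlib import metadata


def check_extension(dist_name: str, module_name: str) -> dict:
    """Report the installed version of a distribution and whether its module imports."""
    report = {"dist": dist_name, "version": None, "import_ok": False, "error": None}
    try:
        report["version"] = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        report["error"] = "not installed"
        return report
    try:
        importlib.import_module(module_name)
        report["import_ok"] = True
    except Exception as exc:
        # A binary mismatch typically surfaces here as an ImportError
        # whose message contains "undefined symbol".
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    for dist, module in [("torch", "torch"), ("flash-attn", "flash_attn"), ("vllm", "vllm")]:
        print(check_extension(dist, module))
```

If torch reports one version and the flash-attn import fails with an undefined symbol, reinstalling or rebuilding flash-attn against the installed torch is the usual next step.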
On top of that, the Phi-3 128k model with a pip install of vllm==0.4.0.post1 (and also newer versions) fails with the following error:
venv/lib/python3.11/site-packages/vllm/config.py", line 816, in _get_and_verify_max_len assert "factor" in rope_scaling
Did you install vllm from source or adjust the RoPE code?
Thanks anyway, because I am still able to run and confirm the results up to Phi-3 128k.
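The assertion in the traceback expects a "factor" key in the model's rope_scaling config. A minimal sketch of that check, useful for inspecting a config.json before launching vLLM. It assumes (not confirmed in this thread) that the Phi-3 128k config defines rope_scaling with keys such as short_factor and long_factor rather than a single factor; the sample dict is hypothetical and abbreviated, and both helper names are illustrative:

```python
import json
from pathlib import Path


def load_rope_scaling(config_path):
    """Read the rope_scaling entry from a Hugging Face style config.json, if any."""
    config = json.loads(Path(config_path).read_text())
    return config.get("rope_scaling")


def missing_rope_keys(rope_scaling, required=("factor",)):
    """Return the required keys absent from rope_scaling (empty list if none)."""
    if rope_scaling is None:
        return []
    return [key for key in required if key not in rope_scaling]


# Hypothetical, abbreviated stand-in for a long-context rope_scaling entry.
sample = {"type": "su", "short_factor": [1.0], "long_factor": [1.0]}
print(missing_rope_keys(sample))  # ['factor']
```

A non-empty result means a check like the one at config.py line 816 would fail for that model.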
Regarding the FA issue, I also sometimes hit an undefined symbol error when installing it. I will try building it from source. Below is the pip freeze of my Docker container; hope this helps.
For Phi-3 128k, I ran inference with the Hugging Face framework instead of vLLM, since vLLM did not yet fully support Phi-3, if I remember correctly.
absl-py==2.0.0
accelerate==0.29.1
addict==2.4.0
aiohttp @ file:///rapids/aiohttp-3.8.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=df72ac063b97837a80d80dec8d54c241af059cc9bb42c4de68bd5b61ceb37caa
aiosignal @ file:///rapids/aiosignal-1.3.1-py3-none-any.whl#sha256=f8376fb07dd1e86a584e4fcdec80b36b7f81aac666ebc724e2c090300dd83b17
alabaster==0.7.16
aniso8601==9.0.1
annotated-types==0.5.0
antlr4-python3-runtime==4.9.3
anyio==4.3.0
apex @ file:///opt/pytorch/apex
appdirs==1.4.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
asciitree==0.3.3
asttokens==2.4.0
astunparse==1.6.3
async-timeout @ file:///rapids/async_timeout-4.0.3-py3-none-any.whl#sha256=7405140ff1230c310e51dc27b3145b9092d659ce68ff733fb0cefe3ee42be028
attrdict==2.0.1
attrs==23.1.0
audioread==3.0.1
Babel==2.14.0
backcall==0.2.0
bcrypt==4.1.2
beautifulsoup4==4.12.2
black==19.10b0
bleach==6.0.0
blis==0.7.11
boto3==1.34.79
botocore==1.34.79
braceexpand==0.1.7
Brotli==1.1.0
cachetools==5.3.1
catalogue==2.0.10
causal-conv1d==1.2.0.post2
cdifflib==1.2.6
certifi==2023.7.22
cffi==1.16.0
charset-normalizer @ file:///rapids/charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=193cbc708ea3aca45e7221ae58f0fd63f933753a9bfb498a3b474878f12caaad
click==8.0.2
clip==0.2.0
cloudpathlib==0.15.1
cloudpickle @ file:///rapids/cloudpickle-2.2.1-py3-none-any.whl#sha256=61f594d1f4c295fa5cd9014ceb3a1fc4a70b0de1164b94fbc2d854ccba056f9f
cmake==3.27.6
colorama==0.4.6
comm==0.1.4
confection==0.1.3
contourpy==1.1.1
cryptography==42.0.5
cubinlinker @ file:///rapids/cubinlinker-0.3.0%2B2.gce0680b-cp310-cp310-linux_x86_64.whl#sha256=8cff93be2d63d7db8f1d15fc72cf813abe3d8fd31c35be439e3fb6b7b4c89f76
cuda-python @ file:///rapids/cuda_python-12.2.0rc5%2B5.g84845d1-cp310-cp310-linux_x86_64.whl#sha256=19bb8c6dd62e976182ff183aab18d2c9f0a698add93a1037f2cbaa5d0f739d9d
cudf @ file:///rapids/cudf-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=12228d0949a6be3a7a383262f77c37372d48e02e57c4d0b8ed3763ced4d26ccb
cugraph @ file:///rapids/cugraph-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=209757e66f1ef51a5bace52774f9fc5575cdc6a00e11287ca8f0be78f57a9661
cugraph-dgl @ file:///rapids/cugraph_dgl-23.8.0-py3-none-any.whl#sha256=ef49cc4464b39aa686b97faa50186bd104cf965a7b7215c7ffb7b94011b6bcea
cugraph-service-client @ file:///rapids/cugraph_service_client-23.8.0-py3-none-any.whl#sha256=54d3f0367285be37ed4166483e4402e71e6a4747fb55e5a32a6ca9abfe264cb5
cugraph-service-server @ file:///rapids/cugraph_service_server-23.8.0-py3-none-any.whl#sha256=1fd5d70166ff9023c2b451f63e1a4a25c0e55e018811fc1549f52dffb7a422f6
cuml @ file:///rapids/cuml-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=f9209e5d1e2c765a4bc0b2955e4bc29016b9c4186b7e0512553f3fff879bf697
cupy-cuda12x @ file:///rapids/cupy_cuda12x-12.1.0-cp310-cp310-linux_x86_64.whl#sha256=840d1f4560436be5aaa9b6071d4947a391ab8c7b4810f035fc7815d43c29ed6d
cycler==0.12.1
cymem==2.0.8
Cython==3.0.3
cytoolz==0.12.3
dask @ file:///rapids/dask-2023.7.1-py3-none-any.whl#sha256=8ca3969805dd1cceee66f1138f103fba6fbaf22ba488f15b2382b4579ee39f02
dask-cuda @ file:///rapids/dask_cuda-23.8.0-py3-none-any.whl#sha256=68d2bef0df1307a28a0306e3501d63e6d19994d8bbe5e5dccd8b0967bcca8d30
dask-cudf @ file:///rapids/dask_cudf-23.8.0-py3-none-any.whl#sha256=8783c9089041462b8a4418d8645db2a7b2bc32c4c4b1800512f387d466ee1f16
datasets==2.18.0
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
diffusers==0.27.2
dill==0.3.8
diskcache==5.6.3
Distance==0.1.3
distributed @ file:///rapids/distributed-2023.7.1-py3-none-any.whl#sha256=1237f8ae11baa9f80070329a33f9d5af32da5c272a98bab088c9b0578c2d816e
distro==1.9.0
dm-tree==0.1.8
docker-pycreds==0.4.0
docopt==0.6.2
docutils==0.20.1
editdistance==0.8.1
einops==0.7.0
einops-exts==0.0.4
exceptiongroup==1.1.3
execnet==2.0.2
executing==2.0.0
expecttest==0.1.3
faiss-cpu==1.8.0
fastapi==0.110.1
fasteners==0.19
fastjsonschema==2.18.1
fastrlock @ file:///rapids/fastrlock-0.8.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_24_x86_64.whl#sha256=d6c53abeae3f9a55b5c65824cec9df59159fa50e8fa800a5c6e8de42b2219c28
fasttext==0.9.2
filelock==3.12.4
flash-attn==2.4.2
Flask==2.2.5
Flask-RESTful==0.3.10
fonttools==4.43.1
frozenlist @ file:///rapids/frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=6918d49b1f90821e93069682c06ffde41829c346c66b721e65a5c62b4bab0300
fsspec @ file:///rapids/fsspec-2023.6.0-py3-none-any.whl#sha256=1cbad1faef3e391fba6dc005ae9b5bdcbf43005c9167ce78c915549c352c869a
ftfy==6.2.0
furl==2.1.3
future==1.0.0
g2p-en==2.1.0
gast==0.5.4
gdown==5.1.0
gevent==24.2.1
geventhttpclient==2.0.2
gitdb==4.0.11
GitPython==3.1.43
google-ai-generativelanguage==0.6.2
google-api-core==2.18.0
google-api-python-client==2.126.0
google-auth==2.23.2
google-auth-httplib2==0.2.0
google-auth-oauthlib==0.4.6
google-generativeai==0.5.1
googleapis-common-protos==1.63.0
graphsurgeon @ file:///workspace/TensorRT-8.6.1.6/graphsurgeon/graphsurgeon-0.4.6-py2.py3-none-any.whl#sha256=0fbadaefbbe6e9920b9f814ae961c4a279be602812edf3ed7fb9cc6f8f4809fe
greenlet==3.0.3
grpcio==1.62.2
grpcio-status==1.62.2
h11==0.14.0
h5py==3.10.0
html2text==2024.2.26
httpcore==1.0.5
httplib2==0.22.0
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.22.2
hydra-core==1.3.2
hypothesis==5.35.1
idna==3.4
ijson==3.2.3
imageio==2.34.0
imagesize==1.4.1
importlib-metadata @ file:///rapids/importlib_metadata-6.8.0-py3-none-any.whl#sha256=3ebb78df84a805d7698245025b975d9d67053cd94c79245ba4b3eb694abe68bb
inflect==7.2.0
iniconfig==2.0.0
intel-openmp==2021.4.0
interegular==0.3.3
intervaltree==3.1.0
ipykernel==6.25.2
ipython==8.16.1
ipython-genutils==0.2.0
ipywidgets==8.1.2
isort==5.13.2
itsdangerous==2.1.2
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.2
jiwer==2.5.2
jmespath==1.0.1
joblib==1.3.2
json5==0.9.14
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
jupyter-tensorboard @ git+https://github.com/cliffwoolley/jupyter_tensorboard.git@ffa7e26138b82549453306e06b535a9ac36db17a
jupyter_client==8.3.1
jupyter_core==5.3.2
jupyterlab==2.3.2
jupyterlab-pygments==0.2.2
jupyterlab-server==1.2.0
jupyterlab_widgets==3.0.10
jupytext==1.15.2
kaldi-python-io==1.2.2
kaldiio==2.18.0
kiwisolver==1.4.5
kornia==0.7.2
kornia_rs==0.1.3
langcodes==3.3.0
lark==1.1.9
latexcodec==3.0.0
lazy_loader==0.4
Levenshtein==0.22.0
lhotse==1.22.0
librosa==0.10.1
lightning-utilities==0.11.2
lilcom==1.7
llvmlite @ file:///rapids/llvmlite-0.40.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=bbd5e82cc990e5a3e343a3bf855c26fdfe3bfae55225f00efd01c05bbda79918
locket @ file:///rapids/locket-1.0.0-py2.py3-none-any.whl#sha256=b6c819a722f7b6bd955b80781788e4a66a55628b858d347536b7e81325a3a5e3
loguru==0.7.2
lxml==5.2.1
mamba-ssm==1.2.0.post1
Markdown==3.4.4
markdown-it-py==3.0.0
markdown2==2.4.13
MarkupSafe==2.1.3
marshmallow==3.21.1
matplotlib==3.8.0
matplotlib-inline==0.1.6
mdit-py-plugins==0.4.0
mdurl==0.1.2
megatron_core==0.5.0
mistune==3.0.2
mkl==2021.1.1
mkl-devel==2021.1.1
mkl-include==2021.1.1
mock==5.1.0
more-itertools==10.2.0
mpmath==1.3.0
msgpack @ file:///rapids/msgpack-1.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=e42b9594cc3bf4d838d67d6ed62b9e59e201862a25e9a157019e171fbe672dd3
multidict @ file:///rapids/multidict-6.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=36c63aaa167f6c6b04ef2c85704e93af16c11d20de1d133e39de6a0e84582a93
multiprocess==0.70.16
murmurhash==1.0.10
nbclient==0.8.0
nbconvert==7.9.2
nbformat==5.9.2
nemo_text_processing==0.3.0rc0
nemo_toolkit==1.23.0
nerfacc==0.5.3
nest-asyncio==1.5.8
networkx==2.6.3
ninja==1.11.1.1
nltk==3.8.1
notebook==6.4.10
numba @ file:///rapids/numba-0.57.1%2B1.g5fba9aa8f-cp310-cp310-linux_x86_64.whl#sha256=348d18dbb5ce363133fa7d033ae804b5440bf51778395f08b337a9ca6ac98e53
numcodecs==0.12.1
numpy==1.24.4
nvfuser==0.0.20+gitunknown
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-dali-cuda120==1.30.0
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.1.105
nvidia-pyindex==1.0.9
nvtx @ file:///rapids/nvtx-0.2.5-cp310-cp310-linux_x86_64.whl#sha256=b8024910cace4d07e6c9677eaf3be1b3e626fa1923ec6e3c7e5d3fdca053c9c9
oauthlib==3.2.2
omegaconf==2.3.0
onnx @ file:///opt/pytorch/pytorch/third_party/onnx
open-clip-torch==2.24.0
openai==1.16.2
OpenCC==1.1.6
opencv @ file:///opencv-4.7.0/modules/python/package
orderedmultidict==1.0.1
outlines==0.0.34
packaging @ file:///rapids/packaging-23.1-py3-none-any.whl#sha256=994793af429502c4ea2ebf6bf664629d07c1a9fe974af92966e4b8d2df7edc61
pandas @ file:///rapids/pandas-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=7a0a56cef15fd1586726dace5616db75ebcfec9179a3a55e78f72c5639fa2a23
pandocfilters==1.5.0
pangu==4.0.6.1
parameterized==0.9.0
paramiko==3.4.0
parso==0.8.3
partd @ file:///rapids/partd-1.4.0-py3-none-any.whl#sha256=7a63529348cf0dff14b986db641cd1b83c16b5cb9fc647c2851779db03282ef8
pathspec==0.12.1
pathy==0.10.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow @ file:///tmp/pillow-simd
plac==1.4.3
platformdirs==3.11.0
pluggy==1.3.0
ply @ file:///rapids/ply-3.11-py2.py3-none-any.whl#sha256=096f9b8350b65ebd2fd1346b12452efe5b9607f7482813ffca50c22722a807ce
polygraphy==0.49.0
pooch==1.7.0
portalocker==2.8.2
preshed==3.0.9
prettytable==3.9.0
progress==1.6
prometheus_client==0.20.0
prompt-toolkit==3.0.39
proto-plus==1.23.0
protobuf==4.24.4
psutil @ file:///rapids/psutil-5.9.4-cp310-abi3-linux_x86_64.whl#sha256=e711cfad802fd4061d559d17e9f175e866551434c3418af2925881a3e5f3440e
ptxcompiler @ file:///rapids/ptxcompiler-0.8.1%2B1.g2cb1b35-cp310-cp310-linux_x86_64.whl#sha256=461049ad74511c8d923967e1826861a0d9a2bcee0cfcf3ebc338fc48b3ecc724
ptyprocess==0.7.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyannote.core==5.0.0
pyannote.database==5.1.0
pyannote.metrics==3.2.1
pyarrow==15.0.2
pyarrow-hotfix==0.6
pyasn1==0.5.0
pyasn1-modules==0.3.0
pybind11==2.11.1
pybind11-global==2.11.1
pybtex==0.24.0
pybtex-docutils==1.0.3
pycocotools @ git+https://github.com/nvidia/cocoapi.git@fa44301f7a8b3f95a9f2751d19bfd735b0f6c65d#subdirectory=PythonAPI
pycparser==2.21
pydantic==2.4.2
pydantic_core==2.10.1
pydub==0.25.1
Pygments==2.16.1
pylibcugraph @ file:///rapids/pylibcugraph-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=8327053f864ed56bf0d0d8fb69a2291ca1e044fa1f447e63b85b29bf72102c74
pylibcugraphops @ file:///rapids/pylibcugraphops-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=17364a79cda63c9f6c62ef6f2bd37151a9e70539f6d60e43fb26ab40e163bba2
pylibraft @ file:///rapids/pylibraft-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=f74580fec4d0e1603f9b3027da33d915ce07a37d2790c28b1d784d133e90a6d2
pyloudnorm==0.1.1
PyMCubes==0.1.4
PyNaCl==1.5.0
pynini==2.1.5
pynvml==11.5.0
pyparsing==3.1.1
pypinyin==0.51.0
pypinyin-dict==0.8.0
PySocks==1.7.1
pytest==7.4.2
pytest-flakefinder==1.1.0
pytest-rerunfailures==12.0
pytest-runner==6.0.1
pytest-shard==0.1.2
pytest-xdist==3.3.1
python-dateutil==2.8.2
python-dotenv==1.0.1
python-hostlist==1.23.0
python-rapidjson==1.16
pytorch-lightning==2.0.7
pytorch-quantization==2.1.2
pytz @ file:///rapids/pytz-2023.3-py2.py3-none-any.whl#sha256=a151b3abb88eda1d4e34a9814df37de2a80e301e68ba0fd856fb9b46bfbbbffb
PyYAML==6.0.1
pyzmq==25.1.1
raft-dask @ file:///rapids/raft_dask-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=9464bd2889aff217d63f2ff804f06328123119e72745399900315fc85f4d6b7e
rapidfuzz==2.13.7
ray==2.10.0
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
requests-oauthlib==1.3.1
resampy==0.4.2
rich==13.7.1
rmm @ file:///rapids/rmm-23.8.0-cp310-cp310-linux_x86_64.whl#sha256=11e3bc42ddfa51f8293ddb37fb006e4dd59fc20534e8f027b5453c8d00fa089f
rotary-emb @ git+https://github.com/HazyResearch/flash-attention.git@f692b98d805850983f14deec7a9104583c58b107#subdirectory=csrc/rotary
rouge-score==0.1.2
rpds-py==0.10.4
rsa==4.9
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
s3transfer==0.10.1
sacrebleu==2.4.1
sacremoses==0.1.1
safetensors==0.4.2
scikit-learn @ file:///rapids/scikit_learn-1.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=184a42842a4e698ffa4d849b6019de50a77a0aa24d26afa28fa49c9190bb144b
scipy @ file:///rapids/scipy-1.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=366a6a937110d80dca4f63b3f5b00cc89d36f678b2d124a01067b154e692bab1
Send2Trash==1.8.2
sentence-transformers==2.6.1
sentencepiece==0.2.0
sentry-sdk==1.44.1
setproctitle==1.3.3
shellingham==1.5.4
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.1
snowballstemmer==2.2.0
sortedcontainers==2.4.0
soundfile==0.12.1
soupsieve==2.5
sox==1.5.0
soxr==0.3.7
spacy==3.7.1
spacy-legacy==3.0.12
spacy-loggers==1.0.5
Sphinx==7.2.6
sphinx-glpi-theme==0.3
sphinxcontrib-applehelp==1.0.8
sphinxcontrib-bibtex==2.6.2
sphinxcontrib-devhelp==1.0.6
sphinxcontrib-htmlhelp==2.0.5
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.7
sphinxcontrib-serializinghtml==1.1.10
srsly==2.4.8
sshtunnel==0.4.0
sshtunnel-requests==0.1.3
stack-data==0.6.3
starlette==0.37.2
sympy==1.12
tabulate==0.9.0
taming-transformers==0.0.1
tbb==2021.10.0
tblib @ file:///rapids/tblib-2.0.0-py3-none-any.whl#sha256=9100bfa016b047d5b980d66e7efed952fbd20bd85b56110aaf473cb97d18709a
tenacity==8.2.3
tensorboard==2.9.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorrt @ file:///workspace/TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl#sha256=2684b4772cb16088184266728a0668f5dac14e66f088c4ccff2096ccb222d74c
tensorstore==0.1.45
termcolor==2.4.0
terminado==0.17.1
text-unidecode==1.3
textdistance==4.6.1
texterrors==0.4.4
thinc==8.2.1
threadpoolctl==3.2.0
thriftpy2 @ file:///rapids/thriftpy2-0.4.16-cp310-cp310-linux_x86_64.whl#sha256=3b41ffe57f0a10ee592e06b4843e37ae1bc7f0309a2478f0bf1368ede2ad4ed4
tiktoken==0.6.0
timm==0.9.16
tinycss2==1.2.1
tokenizers==0.15.2
toml==0.10.2
tomli==2.0.1
toolz @ file:///rapids/toolz-0.12.0-py3-none-any.whl#sha256=2059bd4148deb1884bb0eb770a3cde70e7f954cfbbdc2285f1f2de01fd21eb6f
torch==2.1.2
torch-tensorrt @ file:///opt/pytorch/torch_tensorrt/dist/torch_tensorrt-0.0.0-cp310-cp310-linux_x86_64.whl#sha256=239cc59958283c8fd764ec360b93adf63db94d231c6dbae3212736187d1c1f21
torchdata @ file:///opt/pytorch/data
torchdiffeq==0.2.3
torchmetrics==1.3.2
torchsde==0.2.6
torchtext @ file:///opt/pytorch/text
torchvision==0.16.2
tornado==6.3.3
tqdm==4.66.1
traitlets==5.9.0
trampoline==0.1.2
transformer-engine @ git+https://github.com/NVIDIA/TransformerEngine.git@0fbc76af3733ae997394eaf82b78ff9c0498fe91
transformers==4.39.3
treelite @ file:///rapids/treelite-3.2.0-cp310-cp310-linux_x86_64.whl#sha256=7627a3fed44ce1dda4c35ce707cca4b6108d74a661997c0451be59d03f2155ca
treelite-runtime @ file:///rapids/treelite_runtime-3.2.0-cp310-cp310-linux_x86_64.whl#sha256=085ec1ba71007d357ecebb493c490133c20778cd51d8662a0a10d1dc56b1623e
trimesh==4.2.4
triton @ file:///tmp/dist/triton-2.1.0%2Be621604-cp310-cp310-linux_x86_64.whl#sha256=86f1f780205ac37c236306b5902cb3302c09091058b90902e1d06890ad87a6d9
tritonclient==2.44.0
typed-ast==1.5.5
typeguard==4.2.1
typer==0.12.1
types-dataclasses==0.6.6
typing_extensions==4.11.0
ucx-py @ file:///rapids/ucx_py-0.33.0-cp310-cp310-linux_x86_64.whl#sha256=55d9f5f80627ba1f00577fca41ecd6ab8c72cc518e392a078d108b7dbd809c1e
uff @ file:///workspace/TensorRT-8.6.1.6/uff/uff-0.6.9-py2.py3-none-any.whl#sha256=618a3f812d491f0d3c4f2e38b99e03217ca37b206db14cee079f2bf681eb4fe3
uritemplate==4.1.1
urllib3==2.2.1
uvicorn==0.29.0
uvloop==0.19.0
vllm==0.4.0.post1
wandb==0.16.6
wasabi==1.1.2
watchfiles==0.21.0
wcwidth==0.2.13
weasel==0.3.2
webdataset==0.1.62
webencodings==0.5.1
websockets==12.0
Werkzeug==3.0.0
wget==3.2
widgetsnbextension==4.0.10
wonderwords==2.2.0
wrapt==1.16.0
xdoctest==1.0.2
xformers==0.0.23.post1
xgboost @ file:///rapids/xgboost-1.7.5-cp310-cp310-linux_x86_64.whl#sha256=56f29fb999f8272bf8498ecbaf0659de4becf693b96a545f0e52f627270cf80d
xxhash==3.4.1
yarl @ file:///rapids/yarl-1.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=891c0e3ec5ec881541f6c5113d8df0315ce5440e244a716b95f2525b7b9f3608
youtokentome==1.0.6
zarr==2.17.2
zict @ file:///rapids/zict-3.0.0-py2.py3-none-any.whl#sha256=5796e36bd0e0cc8cf0fbc1ace6a68912611c1dbd74750a3f3026b9b9d6a327ae
zipp @ file:///rapids/zipp-3.16.2-py3-none-any.whl#sha256=679e51dd4403591b2d6838a48de3d283f3d188412a9782faadf845f298736ba0
zope.event==5.0
zope.interface==6.2
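To compare a freeze listing like the one above against a local environment, the two outputs can be parsed and diffed. A minimal sketch using only the standard library; the helper names are hypothetical, and the `local` text is illustrative (only the flash-attn versions 2.4.2 and 2.5.7 come from this thread):

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name so 'Flash_Attn' and 'flash-attn' compare equal."""
    return re.sub(r"[-_.]+", "-", name).strip().lower()


def parse_freeze(text: str) -> dict:
    """Map package name -> pinned version, or the URL/path for 'name @ ...' entries."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if " @ " in line:
            name, spec = line.split(" @ ", 1)
        elif "==" in line:
            name, spec = line.split("==", 1)
        else:
            continue  # skip lines that are not pinned requirements
        pins[normalize(name)] = spec.strip()
    return pins


def diff_freeze(reference: str, local: str) -> dict:
    """Return {name: (reference_spec, local_spec)} for every package that differs."""
    ref, loc = parse_freeze(reference), parse_freeze(local)
    return {
        name: (ref.get(name), loc.get(name))
        for name in sorted(ref.keys() | loc.keys())
        if ref.get(name) != loc.get(name)
    }


container = "flash-attn==2.4.2\ntorch==2.1.2\nvllm==0.4.0.post1\n"
local = "flash-attn==2.5.7\ntorch==2.1.2\nvllm==0.4.0.post1\n"  # illustrative
print(diff_freeze(container, local))  # {'flash-attn': ('2.4.2', '2.5.7')}
```

Entries installed from local wheels (the `@ file:///...` lines) will always differ across machines, so the version pins are the more useful part of the diff.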
Great work on the benchmark. Before benchmarking a model continually pre-trained with Infini-Attention, I wanted to run some sanity checks on the benchmark's reproducibility.
Experimental setup:
The task parameters in synthetic.yaml are unedited from the current main branch.
Results
For LLaMA-2 7b chat, this gives a 4k average of 80.02%, compared to the reported 85.6%.
For Mistral 7B Instruct v0.2, this gives a 4k average of 86.62%, compared to the reported 93.6%.
Questions
This will also help me in reproducing the Phi-3 128k results, as I also got around 54% average for 4k. Thanks!