Closed · MK096 closed this issue 3 years ago
Have you tried the Colab notebook that we provided as a demonstration?
I went through it, but I couldn't figure out how to make it work on my local machine. How do I use those checkpoints on my local machine?
@MK096 for your local machine, I recommend using the transformers library.
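For reference, a minimal sketch of that route (assuming the `transformers` package is installed; the model ID below is the 1.3B GPT-Neo checkpoint on the Hugging Face hub, which is fetched automatically on first use):

```python
from transformers import pipeline

# The first call downloads the checkpoint and tokenizer files
# (vocab/merges, config, weights) into the local HF cache.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")

result = generator(
    "EleutherAI's GPT-Neo is",
    max_length=50,       # total tokens, including the prompt
    do_sample=True,
    temperature=0.9,
)
print(result[0]["generated_text"])
```

No manually downloaded checkpoint files are involved here; everything comes from the hub.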
I did, but it didn't work. I think that's because the files I downloaded from the-eye.eu are raw checkpoint files, and to load the model I'd need encoder.json, pytorch_model.bin, etc. So how do I generate those files from the ones I've downloaded? https://github.com/EleutherAI/gpt-neo/issues/226#issue-919573517
You do not need to use the files from the eye to use GPT-Neo via HuggingFace. You can directly download everything you need from the transformers package, as shown in the example code I linked to.
If you want to use the copy of the model on the eye, you can use the Google Colab notebook as a guide. If you’ve gotten it working on Colab you should be able to get it working locally, as it’s fundamentally the same thing.
I tried the HuggingFace method, but the problem is that after downloading 10-20%, the download speed always drops from 1 Mbps to 5 kbps. The same thing happened when I tried other HuggingFace models, such as opus-mt.
https://github.com/EleutherAI/gpt-neo/issues/219#issue-896609707
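One possible workaround for stalled transfers, assuming the `huggingface_hub` package is installed: fetch all of the checkpoint files up front with `snapshot_download`, which caches completed files so that a restarted run skips anything that already arrived in full, then point `transformers` at the cached copy.

```python
from huggingface_hub import snapshot_download

# Download every file in the model repo into the local cache.
# Re-running after an interruption skips fully downloaded files.
local_dir = snapshot_download("EleutherAI/gpt-neo-1.3B")

# local_dir can then be passed to from_pretrained(...) in transformers.
print(local_dir)
```

This separates the flaky-network problem from the model-loading problem, so each can be debugged on its own.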
I’m really not sure what to say. When I follow the same steps you describe, I don’t run into any problems: I can run the model off the eye checkpoint, and I can download it through HuggingFace. This appears to be a problem on your end.
Thanks for the help. I re-ran everything from scratch. Upon running it (in cmd on my local machine), I now get the following error:
C:\Users\Mayank\GPTNeo>python main.py --predict --prompt try.txt --model GPT3XL
2021-06-14 21:42:02.505836: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
WARNING:tensorflow:From C:\Users\Mayank\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Current step 362000
Saving config to C://Users//Mayank//GPTNeo//models//GPT3_XL
2021-06-14 21:42:15.288177: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-14 21:42:16.082196: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1d5842f8670 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-14 21:42:16.120514: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-06-14 21:42:16.154220: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-06-14 21:42:16.390683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce MX130 computeCapability: 5.0
coreClock: 1.189GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2021-06-14 21:42:16.438659: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-06-14 21:42:16.445569: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-06-14 21:42:16.455962: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-06-14 21:42:16.463522: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-06-14 21:42:16.474350: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-06-14 21:42:16.519344: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-06-14 21:42:16.530178: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-06-14 21:42:16.539556: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-06-14 21:42:16.553314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-06-14 21:42:18.177633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-14 21:42:18.188617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-06-14 21:42:18.194467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-06-14 21:42:18.205021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1369 MB memory) -> physical GPU (device: 0, name: GeForce MX130, pci bus id: 0000:02:00.0, compute capability: 5.0)
2021-06-14 21:42:18.322554: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1d5a2652220 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-14 21:42:18.342300: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce MX130, Compute Capability 5.0
2021-06-14 21:42:19.504241: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:196] None of the MLIR optimization passes are enabled (registered 0 passes)
Done!
params = defaultdict(<function fetch_model_params.
Great! You are now officially out of our hands... this is a GPU configuration issue on your end. I recommend checking out the multi-GPU training documentation for whatever framework you are using. This issue may also be helpful.
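One detail worth noting from the log above: TensorFlow reports only 1369 MB of usable memory on the MX130, while the GPT3XL config corresponds (as far as I can tell) to the 1.3B-parameter model. A back-of-the-envelope check suggests the weights alone cannot fit on that card:

```python
# Rough memory estimate (assumes fp32 weights; ignores activations and
# framework overhead, which only make the shortfall worse).
params = 1.3e9            # GPT-Neo 1.3B (the GPT3XL config, as assumed above)
bytes_per_param = 4       # fp32
weights_gib = params * bytes_per_param / 2**30
free_mib = 1369           # "Created TensorFlow device ... with 1369 MB memory"

print(round(weights_gib, 1))          # 4.8 (GiB of weights alone)
print(weights_gib * 1024 > free_mib)  # True: weights exceed free GPU memory
```

So independent of any configuration issue, a 2 GB laptop GPU is unlikely to hold this model; a larger GPU or CPU-only inference would be needed.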
Hi, I downloaded GPT-Neo from the-eye.eu on my PC. It downloaded various checkpoints. How do I use them? ... Because in order to load and use the model, I'd need encoder.json, pytorch_model.bin, etc.