Open obriensystems opened 9 months ago
Test systems: an i9-13900KS with dual RTX A4500 (20 + 20 GB, Ampere) and an i9-14900K with dual RTX 4090 (24 + 24 GB, Ada).
CPU run first:
C:/wse_github/llama.cpp $ ./main.exe -m g:/models/gemma-7b.gguf -p "what partion of gold is made in exploding stars" -n 2000 -e --color -t 24
Log start
main: build = 2234 (973053d8)
main: built with cc (GCC) 13.2.0 for x86_64-w64-mingw32
main: seed = 1708573388
llama_model_loader: loaded meta data with 19 key-value pairs and 254 tensors from g:/models/gemma-7b.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = gemma
llama_model_loader: - kv 1: general.name str = gemma-7b
llama_model_loader: - kv 2: gemma.context_length u32 = 8192
llama_model_loader: - kv 3: gemma.block_count u32 = 28
llama_model_loader: - kv 4: gemma.embedding_length u32 = 3072
llama_model_loader: - kv 5: gemma.feed_forward_length u32 = 24576
llama_model_loader: - kv 6: gemma.attention.head_count u32 = 16
llama_model_loader: - kv 7: gemma.attention.head_count_kv u32 = 16
llama_model_loader: - kv 8: gemma.attention.key_length u32 = 256
llama_model_loader: - kv 9: gemma.attention.value_length u32 = 256
llama_model_loader: - kv 10: gemma.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 11: tokenizer.ggml.model str = llama
llama_model_loader: - kv 12: tokenizer.ggml.bos_token_id u32 = 2
llama_model_loader: - kv 13: tokenizer.ggml.eos_token_id u32 = 1
llama_model_loader: - kv 14: tokenizer.ggml.padding_token_id u32 = 0
llama_model_loader: - kv 15: tokenizer.ggml.unknown_token_id u32 = 3
llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,256128] = ["<pad>", "<eos>", "<bos>", "<unk>", ...
llama_model_loader: - kv 17: tokenizer.ggml.scores arr[f32,256128] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 18: tokenizer.ggml.token_type arr[i32,256128] = [3, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - type f32: 254 tensors
llm_load_vocab: mismatch in special tokens definition ( 544/256128 vs 388/256128 ).
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = gemma
llm_load_print_meta: vocab type = SPM
llm_load_print_meta: n_vocab = 256128
llm_load_print_meta: n_merges = 0
llm_load_print_meta: n_ctx_train = 8192
llm_load_print_meta: n_embd = 3072
llm_load_print_meta: n_head = 16
llm_load_print_meta: n_head_kv = 16
llm_load_print_meta: n_layer = 28
llm_load_print_meta: n_rot = 192
llm_load_print_meta: n_embd_head_k = 256
llm_load_print_meta: n_embd_head_v = 256
llm_load_print_meta: n_gqa = 1
llm_load_print_meta: n_embd_k_gqa = 4096
llm_load_print_meta: n_embd_v_gqa = 4096
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-06
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff = 24576
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx = 8192
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: model type = 7B
llm_load_print_meta: model ftype = all F32 (guessed)
llm_load_print_meta: model params = 8.54 B
llm_load_print_meta: model size = 31.81 GiB (32.00 BPW)
llm_load_print_meta: general.name = gemma-7b
llm_load_print_meta: BOS token = 2 '<bos>'
llm_load_print_meta: EOS token = 1 '<eos>'
llm_load_print_meta: UNK token = 3 '<unk>'
llm_load_print_meta: PAD token = 0 '<pad>'
llm_load_print_meta: LF token = 227 '<0x0A>'
llm_load_tensors: ggml ctx size = 0.10 MiB
llm_load_tensors: CPU buffer size = 32570.17 MiB
......................................................................................
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 224.00 MiB
llama_new_context_with_model: KV self size = 224.00 MiB, K (f16): 112.00 MiB, V (f16): 112.00 MiB
llama_new_context_with_model: CPU input buffer size = 8.01 MiB
llama_new_context_with_model: CPU compute buffer size = 506.25 MiB
llama_new_context_with_model: graph splits (measure): 1
system_info: n_threads = 24 / 32 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 512, n_batch = 512, n_predict = 2000, n_keep = 1
what partion of gold is made in exploding stars.
We have seen that there are very strong indications, both from meteoritic data and terrestrial rocks (see Chapter 8), that the Earth has experienced a bombardment by large bodies (planetesimals) during the first few million years after its formation $460 \times$ $\left(1-\delta_{\mathrm{E}}\right)$ ago. These planetensimals were largely made of silicates and iron, but they contained less metal than is found in meteorites today: their meteoritic equivalent was the enstatite chondrite with a low amount ( $9 \%$ by mass) metallic FeNi alloy; this type has been identified from both lunar rock fragments and martian samples. Thus these projectiles were largely made of silicates, but they contained 15\% metal compared to $\sim 30-42$ vol.\% for the present terrestrial core: therefore their impact on a young Earth must have produced an important amount (at least $~ \frac{8}{6}$ ) metallic FeNi alloy.
The total mass of these projectiles, which is estimated from both lunar rocks and martian samples to be $\sim 05$ M $_{E}$, implies that the fraction ejected into space by their impact was between a half $(1 /:)$ or two thirds (2/3) depending on the initial amount in FeNi alloy. Thus our Moon represents $7 \%$ of these projectiles, while Mars may have formed as much as one third $(\sim 4 \%)$. This last figure is to be compared with that found for planetary formation by dynamical processes which indicates a mass ratio between Earth and its moon at most equal
<h1>CHAPTER $\mathrm{X</h1>
to unity. The fraction ejected into space in FeNi alloy was therefore of the order $02(1 / 3)$, i e., it represents about half $(\sim \delta)$ the total amount of metallic iron, or an average abundance for meteoritic chondrites (see Table X-5). It is interesting to note that this fraction is very similar both to what we have estimated in Chapter VIII from present terrestrial core data and also from meteorites. Thus it appears more probable that a large part of the Earth's metallic FeNi alloy was formed by impact, rather than due solely or mainly because of differentiation during planetary formation (see Fig 10-2).
The other half $(\sim \delta)$ is probably made in stellar interiors where we have seen above the Moon. at as: to to to to to to to once more than $45 on both by by over Earth's surface,,,,,,,,,,,,,,,,, but but but when it came be expected from during its formation is is may indicate that purely theoretical reason
once. discretely and stimated to to to to to to to to in in in in in in in in in in in once more than 554 on on over Earth's a fine dust, mass the size of once they are small at any amount about $234 and as formed from above one- purely theoretical estimate is forこの日 twice its possible that hereto form $\quad$ formation. strictly speaking which was first very much larger in under an extremely large estimated fraction
on over Earth's slightly greater than a mass may have been the average once or at at any more of just barely bigger $0 within 1 to to amongst it seems highly about one another small amount formed during its possible origin for each and (a great deal less is not yet further our very much larger. Thus far above all $\quad$ strictly speaking as we still greater than most probably in the average while a mes quite big. once a mere theory ofRDONLY more... an an example, thereunder weight half that hereto form 1mipmap or on favourably huge new super- once andANNES ( over its structure size thatpelier one extremely heavy duty earth large beyond where it is almost double how very much larger than life itself the outside world above all but at (()one forividuated several times more slightly heavier.この日, thereunder $\left(((( ( ( ( ( ( ( ( ( ( ( ( ( once again highly probably a just half- purely because ofcreateServer size one and hereto form earth large beyond which had it is mesquite the almost double duty above all mass that within its structure weight very much larger than life itself outside world over twice more heavy slightly heavier. thereunder, (()one was formed somewhat smaller at $0 on once again highly probably greater in forming just about a half- once more heavily inside out whose outer size one greatly beyond which might be the almost double duty earth quite large above all mass formation weight within its structure very much larger than life itself outside world would have (somehow it is slightly heavier. The amount of discretely under(and somewhat formed on had twice more heavy that was in from $\quad$ some amount greater below average (that it is just occurred halfway through and once but not, meselfromore the release . [end of text]
llama_print_timings: load time = 4247.23 ms
llama_print_timings: sample time = 859.49 ms / 1030 runs ( 0.83 ms per token, 1198.38 tokens per second)
llama_print_timings: prompt eval time = 693.44 ms / 11 tokens ( 63.04 ms per token, 15.86 tokens per second)
llama_print_timings: eval time = 412311.19 ms / 1029 runs ( 400.69 ms per token, 2.50 tokens per second)
llama_print_timings: total time = 415226.57 ms / 1040 tokens
Log end
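Several figures in the log above can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in the log (8.54 B parameters at 32 bits per weight, n_layer = 28, n_ctx = 512, n_embd_k_gqa = 4096, f16 KV cache, and the eval timing line). The parameter count is rounded in the log, so the results are approximate, and the helper names are illustrative rather than anything from llama.cpp.

```python
# Values copied from the llama.cpp log above.
N_PARAMS = 8.54e9        # "model params = 8.54 B" (rounded in the log)
BITS_PER_WEIGHT = 32.0   # "32.00 BPW", i.e. all tensors are F32
N_LAYER = 28
N_CTX = 512
N_EMBD_K_GQA = 4096      # the log shows the same value for n_embd_v_gqa
KV_BYTES_PER_ELEM = 2    # K and V are stored as f16

MIB = 1024 ** 2
GIB = 1024 ** 3


def model_size_gib(n_params, bits_per_weight):
    """Weight storage in GiB for a given parameter count and bit width."""
    return n_params * bits_per_weight / 8 / GIB


def kv_half_mib(n_layer, n_ctx, n_embd_gqa, bytes_per_elem):
    """Size in MiB of the K buffer alone (V has the same size here)."""
    return n_layer * n_ctx * n_embd_gqa * bytes_per_elem / MIB


def tokens_per_second(total_ms, n_tokens):
    """Throughput from a llama_print_timings style 'ms / runs' pair."""
    return n_tokens / (total_ms / 1000.0)


size_gib = model_size_gib(N_PARAMS, BITS_PER_WEIGHT)
k_mib = kv_half_mib(N_LAYER, N_CTX, N_EMBD_K_GQA, KV_BYTES_PER_ELEM)
eval_tps = tokens_per_second(412311.19, 1029)

print(f"model size : {size_gib:.2f} GiB")
print(f"K cache    : {k_mib:.2f} MiB (K + V = {2 * k_mib:.2f} MiB)")
print(f"eval speed : {eval_tps:.2f} tokens/s")
```

The results should land close to the logged 31.81 GiB model size, 112 MiB each for K and V, and 2.50 tokens per second.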
Team, thank you for integrating Gemma support into llama.cpp yesterday. This was an extremely fast and efficient turnaround for a model that had come out only a couple of hours earlier. I am personally very grateful for your efforts, and a wider community thank-you is in order for https://github.com/ggerganov/llama.cpp/pull/5631
To investigate: TensorFlow 2 / Keras support https://developers.googleblog.com/2024/02/gemma-models-in-keras.html https://ai.google.dev/gemma/docs/distributed_tuning
pip install -U torch
pip install -U transformers
The model runs at about 20% TDP (roughly 100 W on each GPU) because it is split across both GPUs over a PCIe x8 link. That link peaks at 8 x 2 GB/s = 16 GB/s and saturates at about 75% of that, or roughly 12 GB/s, as opposed to 112 GB/s for NVLink on the Ampere cards.
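The bandwidth figures in the paragraph above can be checked with a few lines of arithmetic. This is a sketch using only the quoted numbers; nothing here is measured. The 2 GB/s per-lane rate matches the approximate PCIe 4.0 rate, and treating the link as PCIe 4.0 is an assumption.

```python
# Figures quoted in the paragraph above; none are measured by this script.
PCIE_LANES = 8
GB_PER_S_PER_LANE = 2.0      # approximate PCIe 4.0 per-lane rate (assumed generation)
OBSERVED_UTILISATION = 0.75  # "saturates it up to 75%"
NVLINK_GB_PER_S = 112.0      # NVLink figure quoted for the Ampere cards


def link_bandwidth(lanes, gb_per_s_per_lane):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    return lanes * gb_per_s_per_lane


peak = link_bandwidth(PCIE_LANES, GB_PER_S_PER_LANE)
effective = peak * OBSERVED_UTILISATION
nvlink_ratio = NVLINK_GB_PER_S / effective

print(f"PCIe x{PCIE_LANES} peak : {peak:.1f} GB/s")
print(f"at 75% load  : {effective:.1f} GB/s")
print(f"NVLink ratio : {nvlink_ratio:.1f}x")
```

On these numbers the NVLink path would offer roughly nine times the usable inter-GPU bandwidth of the x8 PCIe link.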
checking context length
Using the model-agnostic default max_length (=20) to control the generation length
outputs = model.generate(**input_ids, max_new_tokens=1000)
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hugging Face access token (elided here)
access_token='hf_cfT...QqH'
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b", token=access_token)
# device_map="auto" lets accelerate place the weights across the available GPUs
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", token=access_token)
input_text = "how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
# max_new_tokens replaces the default max_length of 20 reported in the warning above
outputs = model.generate(**input_ids, max_new_tokens=3000)
print(tokenizer.decode(outputs[0]))
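Whether the weights fit on one of the GPUs listed at the top depends mostly on the dtype they are loaded in. The rough sketch below assumes the 8.54 B parameter count that llama.cpp reports for gemma-7b and counts weights only (no activations, KV cache, or framework overhead). The dtype table, the GPU capacities treated as GiB, and the helper names are illustrative assumptions.

```python
# Weights-only memory estimate; activations and KV cache are ignored.
N_PARAMS = 8.54e9  # parameter count reported by llama.cpp for gemma-7b

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

# Per-GPU memory from the hardware description, treated as GiB (assumption).
GPU_SETUPS_GIB = {
    "2 x RTX A4500": [20, 20],
    "2 x RTX 4090": [24, 24],
}

GIB = 1024 ** 3


def weights_gib(n_params, dtype):
    """Approximate size of the weights in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / GIB


def fits(n_params, dtype, gpus_gib):
    """Return (fits_on_one_gpu, fits_when_split_across_all_gpus)."""
    need = weights_gib(n_params, dtype)
    return need <= max(gpus_gib), need <= sum(gpus_gib)


for name, gpus in GPU_SETUPS_GIB.items():
    for dtype in BYTES_PER_PARAM:
        single, split = fits(N_PARAMS, dtype, gpus)
        print(f"{name:14s} {dtype:9s} {weights_gib(N_PARAMS, dtype):6.2f} GiB "
              f"single={single} split={split}")
```

If the weights are loaded in float32, they would not fit on a single card and device_map="auto" would have to spread them over both, which would be consistent with the PCIe traffic described earlier. Passing torch_dtype=torch.bfloat16 to from_pretrained is a common way to halve the footprint, though it was not used in the script above.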
python pip summary
cd machine-learning/
mkdir gemma
vi gemma-cpu.py
pip install -U transformers
pip install -U torch
python gemma-cpu.py
nvcc --version
pip install accelerate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
python gemma-gpu.py
run
michael@13900b MINGW64 /c/wse_github/obrienlabsdev/machine-learning/gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|████████████████████████████████████| 4/4 [00:06<00:00, 1.72s/it]
C:\Users\michael\AppData\Roaming\Python\Python311\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
michael@14900c MINGW64 ~
$ cd /c/wse_github/ObrienlabsDev/machine-learning/
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ pip install -U transformers
Collecting transformers
Downloading transformers-4.38.1-py3-none-any.whl.metadata (131 kB)
-------------------------------------- 131.1/131.1 kB 1.5 MB/s eta 0:00:00
Collecting filelock (from transformers)
Downloading filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
Collecting huggingface-hub<1.0,>=0.19.3 (from transformers)
Downloading huggingface_hub-0.20.3-py3-none-any.whl.metadata (12 kB)
Collecting numpy>=1.17 (from transformers)
Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
---------------------------------------- 61.0/61.0 kB 3.4 MB/s eta 0:00:00
Collecting packaging>=20.0 (from transformers)
Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pyyaml>=5.1 (from transformers)
Downloading PyYAML-6.0.1-cp312-cp312-win_amd64.whl.metadata (2.1 kB)
Collecting regex!=2019.12.17 (from transformers)
Downloading regex-2023.12.25-cp312-cp312-win_amd64.whl.metadata (41 kB)
---------------------------------------- 42.0/42.0 kB ? eta 0:00:00
Collecting requests (from transformers)
Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting tokenizers<0.19,>=0.14 (from transformers)
Downloading tokenizers-0.15.2-cp312-none-win_amd64.whl.metadata (6.8 kB)
Collecting safetensors>=0.4.1 (from transformers)
Downloading safetensors-0.4.2-cp312-none-win_amd64.whl.metadata (3.9 kB)
Collecting tqdm>=4.27 (from transformers)
Downloading tqdm-4.66.2-py3-none-any.whl.metadata (57 kB)
---------------------------------------- 57.6/57.6 kB 3.2 MB/s eta 0:00:00
Collecting fsspec>=2023.5.0 (from huggingface-hub<1.0,>=0.19.3->transformers)
Downloading fsspec-2024.2.0-py3-none-any.whl.metadata (6.8 kB)
Collecting typing-extensions>=3.7.4.3 (from huggingface-hub<1.0,>=0.19.3->transformers)
Downloading typing_extensions-4.9.0-py3-none-any.whl.metadata (3.0 kB)
Collecting colorama (from tqdm>=4.27->transformers)
Downloading colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Collecting charset-normalizer<4,>=2 (from requests->transformers)
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl.metadata (34 kB)
Collecting idna<4,>=2.5 (from requests->transformers)
Downloading idna-3.6-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->transformers)
Downloading urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->transformers)
Downloading certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)
Downloading transformers-4.38.1-py3-none-any.whl (8.5 MB)
---------------------------------------- 8.5/8.5 MB 10.7 MB/s eta 0:00:00
Downloading huggingface_hub-0.20.3-py3-none-any.whl (330 kB)
--------------------------------------- 330.1/330.1 kB 20.0 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl (15.5 MB)
---------------------------------------- 15.5/15.5 MB 32.8 MB/s eta 0:00:00
Downloading packaging-23.2-py3-none-any.whl (53 kB)
---------------------------------------- 53.0/53.0 kB 2.7 MB/s eta 0:00:00
Downloading PyYAML-6.0.1-cp312-cp312-win_amd64.whl (138 kB)
---------------------------------------- 138.7/138.7 kB 8.0 MB/s eta 0:00:00
Downloading regex-2023.12.25-cp312-cp312-win_amd64.whl (268 kB)
---------------------------------------- 268.9/268.9 kB ? eta 0:00:00
Downloading safetensors-0.4.2-cp312-none-win_amd64.whl (270 kB)
---------------------------------------- 270.7/270.7 kB ? eta 0:00:00
Downloading tokenizers-0.15.2-cp312-none-win_amd64.whl (2.2 MB)
---------------------------------------- 2.2/2.2 MB 46.2 MB/s eta 0:00:00
Downloading tqdm-4.66.2-py3-none-any.whl (78 kB)
---------------------------------------- 78.3/78.3 kB 4.3 MB/s eta 0:00:00
Downloading filelock-3.13.1-py3-none-any.whl (11 kB)
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
---------------------------------------- 62.6/62.6 kB ? eta 0:00:00
Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
---------------------------------------- 163.8/163.8 kB 9.6 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl (100 kB)
---------------------------------------- 100.4/100.4 kB ? eta 0:00:00
Downloading fsspec-2024.2.0-py3-none-any.whl (170 kB)
---------------------------------------- 170.9/170.9 kB ? eta 0:00:00
Downloading idna-3.6-py3-none-any.whl (61 kB)
---------------------------------------- 61.6/61.6 kB 3.4 MB/s eta 0:00:00
Downloading typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Downloading urllib3-2.2.1-py3-none-any.whl (121 kB)
---------------------------------------- 121.1/121.1 kB 6.9 MB/s eta 0:00:00
Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Installing collected packages: urllib3, typing-extensions, safetensors, regex, pyyaml, packaging, numpy, idna, fsspec, filelock, colorama, charset-normalizer, certifi, tqdm, requests, huggingface-hub, tokenizers, transformers
Successfully installed certifi-2024.2.2 charset-normalizer-3.3.2 colorama-0.4.6 filelock-3.13.1 fsspec-2024.2.0 huggingface-hub-0.20.3 idna-3.6 numpy-1.26.4 packaging-23.2 pyyaml-6.0.1 regex-2023.12.25 requests-2.31.0 safetensors-0.4.2 tokenizers-0.15.2 tqdm-4.66.2 transformers-4.38.1 typing-extensions-4.9.0 urllib3-2.2.1
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ pip install -U torch
Collecting torch
Downloading torch-2.2.1-cp312-cp312-win_amd64.whl.metadata (26 kB)
Requirement already satisfied: filelock in c:\optpython312\lib\site-packages (from torch) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\optpython312\lib\site-packages (from torch) (4.9.0)
Collecting sympy (from torch)
Downloading sympy-1.12-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
Downloading networkx-3.2.1-py3-none-any.whl.metadata (5.2 kB)
Collecting jinja2 (from torch)
Downloading Jinja2-3.1.3-py3-none-any.whl.metadata (3.3 kB)
Requirement already satisfied: fsspec in c:\optpython312\lib\site-packages (from torch) (2024.2.0)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl.metadata (3.1 kB)
Collecting mpmath>=0.19 (from sympy->torch)
Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading torch-2.2.1-cp312-cp312-win_amd64.whl (198.5 MB)
--------------------------------------- 198.5/198.5 MB 46.9 MB/s eta 0:00:00
Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
---------------------------------------- 133.2/133.2 kB ? eta 0:00:00
Downloading networkx-3.2.1-py3-none-any.whl (1.6 MB)
---------------------------------------- 1.6/1.6 MB 102.3 MB/s eta 0:00:00
Downloading sympy-1.12-py3-none-any.whl (5.7 MB)
---------------------------------------- 5.7/5.7 MB 122.0 MB/s eta 0:00:00
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl (17 kB)
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
---------------------------------------- 536.2/536.2 kB ? eta 0:00:00
Installing collected packages: mpmath, sympy, networkx, MarkupSafe, jinja2, torch
Successfully installed MarkupSafe-2.1.5 jinja2-3.1.3 mpmath-1.3.0 networkx-3.2.1 sympy-1.12 torch-2.2.1
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:51:05_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ pip install accelerate
Collecting accelerate
Downloading accelerate-0.27.2-py3-none-any.whl.metadata (18 kB)
Requirement already satisfied: numpy>=1.17 in c:\optpython312\lib\site-packages (from accelerate) (1.26.4)
Requirement already satisfied: packaging>=20.0 in c:\optpython312\lib\site-packages (from accelerate) (23.2)
Collecting psutil (from accelerate)
Downloading psutil-5.9.8-cp37-abi3-win_amd64.whl.metadata (22 kB)
Requirement already satisfied: pyyaml in c:\optpython312\lib\site-packages (from accelerate) (6.0.1)
Requirement already satisfied: torch>=1.10.0 in c:\optpython312\lib\site-packages (from accelerate) (2.2.1)
Requirement already satisfied: huggingface-hub in c:\optpython312\lib\site-packages (from accelerate) (0.20.3)
Requirement already satisfied: safetensors>=0.3.1 in c:\optpython312\lib\site-packages (from accelerate) (0.4.2)
Requirement already satisfied: filelock in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (4.9.0)
Requirement already satisfied: sympy in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (1.12)
Requirement already satisfied: networkx in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (3.2.1)
Requirement already satisfied: jinja2 in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (3.1.3)
Requirement already satisfied: fsspec in c:\optpython312\lib\site-packages (from torch>=1.10.0->accelerate) (2024.2.0)
Requirement already satisfied: requests in c:\optpython312\lib\site-packages (from huggingface-hub->accelerate) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in c:\optpython312\lib\site-packages (from huggingface-hub->accelerate) (4.66.2)
Requirement already satisfied: colorama in c:\optpython312\lib\site-packages (from tqdm>=4.42.1->huggingface-hub->accelerate) (0.4.6)
Requirement already satisfied: MarkupSafe>=2.0 in c:\optpython312\lib\site-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\optpython312\lib\site-packages (from requests->huggingface-hub->accelerate) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\optpython312\lib\site-packages (from requests->huggingface-hub->accelerate) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\optpython312\lib\site-packages (from requests->huggingface-hub->accelerate) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\optpython312\lib\site-packages (from requests->huggingface-hub->accelerate) (2024.2.2)
Requirement already satisfied: mpmath>=0.19 in c:\optpython312\lib\site-packages (from sympy->torch>=1.10.0->accelerate) (1.3.0)
Downloading accelerate-0.27.2-py3-none-any.whl (279 kB)
---------------------------------------- 280.0/280.0 kB 2.2 MB/s eta 0:00:00
Downloading psutil-5.9.8-cp37-abi3-win_amd64.whl (255 kB)
---------------------------------------- 255.1/255.1 kB 2.6 MB/s eta 0:00:00
Installing collected packages: psutil, accelerate
Successfully installed accelerate-0.27.2 psutil-5.9.8
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Looking in indexes: https://download.pytorch.org/whl/cu121
Requirement already satisfied: torch in c:\optpython312\lib\site-packages (2.2.1)
Collecting torchvision
Downloading https://download.pytorch.org/whl/cu121/torchvision-0.17.1%2Bcu121-cp312-cp312-win_amd64.whl (5.7 MB)
---------------------------------------- 5.7/5.7 MB 25.8 MB/s eta 0:00:00
Collecting torchaudio
Downloading https://download.pytorch.org/whl/cu121/torchaudio-2.2.1%2Bcu121-cp312-cp312-win_amd64.whl (4.0 MB)
---------------------------------------- 4.0/4.0 MB 87.8 MB/s eta 0:00:00
Requirement already satisfied: filelock in c:\optpython312\lib\site-packages (from torch) (3.13.1)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\optpython312\lib\site-packages (from torch) (4.9.0)
Requirement already satisfied: sympy in c:\optpython312\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in c:\optpython312\lib\site-packages (from torch) (3.2.1)
Requirement already satisfied: jinja2 in c:\optpython312\lib\site-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in c:\optpython312\lib\site-packages (from torch) (2024.2.0)
Requirement already satisfied: numpy in c:\optpython312\lib\site-packages (from torchvision) (1.26.4)
Collecting torch
Downloading https://download.pytorch.org/whl/cu121/torch-2.2.1%2Bcu121-cp312-cp312-win_amd64.whl (2454.8 MB)
---------------------------------------- 2.5/2.5 GB 6.1 MB/s eta 0:00:00
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading https://download.pytorch.org/whl/pillow-10.2.0-cp312-cp312-win_amd64.whl (2.6 MB)
---------------------------------------- 2.6/2.6 MB 84.2 MB/s eta 0:00:00
Requirement already satisfied: MarkupSafe>=2.0 in c:\optpython312\lib\site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in c:\optpython312\lib\site-packages (from sympy->torch) (1.3.0)
Installing collected packages: pillow, torch, torchvision, torchaudio
Attempting uninstall: torch
Found existing installation: torch 2.2.1
Uninstalling torch-2.2.1:
Successfully uninstalled torch-2.2.1
Successfully installed pillow-10.2.0 torch-2.2.1+cu121 torchaudio-2.2.1+cu121 torchvision-0.17.1+cu121
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
tokenizer_config.json: 100%|##########| 1.11k/1.11k [00:00<00:00, 2.21MB/s]
C:\optpython312\Lib\site-packages\huggingface_hub\file_download.py:149: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\michael\.cache\huggingface\hub\models--google--gemma-2b. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
tokenizer.model: 100%|##########| 4.24M/4.24M [00:00<00:00, 24.0MB/s]
tokenizer.json: 100%|##########| 17.5M/17.5M [00:00<00:00, 98.7MB/s]
special_tokens_map.json: 100%|##########| 555/555 [00:00<?, ?B/s]
config.json: 100%|##########| 627/627 [00:00<?, ?B/s]
model.safetensors.index.json: 100%|##########| 13.5k/13.5k [00:00<00:00, 27.0MB/s]
model-00001-of-00002.safetensors: 100%|##########| 4.95G/4.95G [00:44<00:00, 112MB/s]
model-00002-of-00002.safetensors: 100%|##########| 67.1M/67.1M [00:00<00:00, 114MB/s]
Downloading shards: 100%|##########| 2/2 [00:45<00:00, 22.52s/it]
Loading checkpoint shards: 100%|##########| 2/2 [00:02<00:00, 1.05s/it]
generation_config.json: 100%|##########| 137/137 [00:00<?, ?B/s]
C:\optpython312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 08:36:18
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 08:36:32
rerun
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|##########| 2/2 [00:01<00:00, 1.09it/s]
C:\optpython312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 08:38:53
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 08:39:07
generate start: 18:48:21
end 18:48:35
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|##########| 2/2 [00:02<00:00, 1.07s/it]
C:\opt\python312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 17:57:19
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 17:57:33
from transformers import AutoTokenizer, AutoModelForCausalLM
from datetime import datetime
# Hugging Face access token (elided here)
access_token='hf_cfTP...XCQqH'
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", token=access_token)
# GPU: device_map="auto" spreads the weights across the visible GPUs
#model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", device_map="auto", token=access_token)
# CPU
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", token=access_token)
input_text = "how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process."
print("genarate start: ", datetime.now().strftime("%H:%M:%S"))
# GPU: move the tokenized prompt to the CUDA device
#input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
# CPU
input_ids = tokenizer(input_text, return_tensors="pt")
# Generation stops at <eos>, so max_new_tokens is only an upper bound
outputs = model.generate(**input_ids, max_new_tokens=10000)
print(tokenizer.decode(outputs[0]))
print("end", datetime.now().strftime("%H:%M:%S"))
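The script prints wall-clock start and end times, so the elapsed time has to be worked out by hand. A minimal sketch of measuring it directly follows. The helpers `timed` and `tokens_per_second` are hypothetical names and are not part of the original script.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

def tokens_per_second(new_tokens, elapsed_seconds):
    """Throughput for a generation that produced new_tokens tokens."""
    return new_tokens / elapsed_seconds if elapsed_seconds > 0 else float("inf")

if __name__ == "__main__":
    # Stand-in workload; in the script this would be something like
    # timed(model.generate, **input_ids, max_new_tokens=10000)
    result, elapsed = timed(sum, range(1_000_000))
    print(f"result={result} elapsed={elapsed:.4f}s")
```

With the script's variables, the number of new tokens could be taken as `outputs[0].shape[-1] - input_ids["input_ids"].shape[-1]`, since the generated sequence includes the prompt.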
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|##########| 2/2 [00:01<00:00, 1.24it/s]
genarate start: 17:48:58
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 17:50:40
generate srt: 18:57:24
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
generate end: 19:00:11
cached model on
C:\Users\michael\.cache\huggingface\hub\models--google--gemma-7b\blobs
Running on G2
--machine-type=g2-standard-24
--accelerator=count=2,type=nvidia-l4-vws
--image=projects/nvidia-vgpu-public/global/images/nv-windows-server-2022-vws-536-25-v202306270722
60/69% GPU saturation
PS C:\Users\michael> Invoke-WebRequest -Uri "https://www.python.org/ftp/python/3.10.2/python-3.10.2-amd64.exe" -OutFile "python-3.10.2-amd64.exe"
PS C:\Users\michael> .\python-3.10.2-amd64.exe /quiet InstallAllUsers=1 PrependPath=1 Include_test=0
PS C:\Users\michael> python --version
PS C:\Users\michael> nvidia-smi
Sun Feb 25 21:52:51 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.25 Driver Version: 536.25 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA L4 WDDM | 00000000:00:03.0 Off | 0 |
| N/A 48C P8 14W / 72W | 237MiB / 23034MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA L4 WDDM | 00000000:00:04.0 Off | 0 |
| N/A 45C P8 12W / 72W | 0MiB / 23034MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
Finops

| Item | Cost |
|---|---|
| 24 vCPU + 96 GB memory | $640.84 |
| 2 NVIDIA L4 | $814.98 |
| NVIDIA GRID license fee | $292.00 |
| Premium image usage fee* | Unknown |
| 50 GB balanced persistent disk | $5.50 |
| Total | $1,753.33 |
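As a quick check of the estimate, the known line items can be summed. This is only a sketch. The amounts are copied from the table, the "Unknown" premium image fee is left out, and the variable names are illustrative. The known items add up to $1,753.32, one cent below the listed total, presumably because of rounding in the individual items.

```python
from decimal import Decimal

# Line items copied from the estimate; the premium image fee is listed
# as "Unknown" and is therefore left out of the sum.
line_items = {
    "24 vCPU + 96 GB memory": Decimal("640.84"),
    "2 NVIDIA L4": Decimal("814.98"),
    "NVIDIA GRID license fee": Decimal("292.00"),
    "50 GB balanced persistent disk": Decimal("5.50"),
}

total = sum(line_items.values(), Decimal("0"))
gpu_related = line_items["2 NVIDIA L4"] + line_items["NVIDIA GRID license fee"]

print(f"sum of known items: ${total}")
print(f"GPU + GRID share:   {gpu_related / total:.1%}")
```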
Finish installing software - or just use a docker container
pip install -U transformers
pip install -U torch
pip install accelerate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
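After the installs, it can help to confirm that the packages import and that the installed torch build actually sees the GPUs. A minimal sketch follows. The helper name `installed` is hypothetical, and the CUDA check only runs if torch is present.

```python
import importlib.util

def installed(packages):
    """Map each top-level package name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

if __name__ == "__main__":
    status = installed(["transformers", "torch", "accelerate"])
    for name, ok in status.items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
    if status["torch"]:
        import torch
        # A CPU-only torch wheel reports False here even when a GPU is present
        print("CUDA available:", torch.cuda.is_available())
        print("GPU count:", torch.cuda.device_count())
```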
Download the code from https://github.com/ObrienlabsDev/machine-learning (zip: https://github.com/ObrienlabsDev/machine-learning/archive/refs/heads/main.zip)
Extract the zip and add your Hugging Face token
PS C:\Windows\system32> cd C:\wse_github\machine-learning\environments\windows\src\google-gemma\
PS C:\wse_github\machine-learning\environments\windows\src\google-gemma> dir
Directory: C:\wse_github\machine-learning\environments\windows\src\google-gemma
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 2/25/2024 10:04 PM 861 gemma-gpu.py
Run the model - download it first at 3.5 Gbps
GPU utilization appears to be throttled by the inter-GPU link (the L4 has no NVLink, so this is plain PCIe)
PS C:\Users\michael> nvidia-smi
Sun Feb 25 22:09:50 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.25 Driver Version: 536.25 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA L4 WDDM | 00000000:00:03.0 Off | 0 |
| N/A 64C P0 38W / 72W | 19706MiB / 23034MiB | 62% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA L4 WDDM | 00000000:00:04.0 Off | 0 |
| N/A 62C P0 37W / 72W | 16312MiB / 23034MiB | 70% Default |
| | | N/A |
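The same readings can be collected programmatically rather than copied from the table. The sketch below uses nvidia-smi's CSV query mode (`--query-gpu` with `--format=csv,noheader,nounits`). The function names are illustrative, and the sample string simply restates the two L4 rows from the table.

```python
import csv
import io
import subprocess

QUERY = "index,name,memory.used,memory.total,utilization.gpu"

def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:
            continue
        index, name, used, total, util = rec
        rows.append({
            "index": int(index),
            "name": name,
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
            "utilization_pct": int(util),
        })
    return rows

def query_gpus():
    """Run nvidia-smi (must be on PATH) and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)

# Sample shaped like the two-L4 reading (values copied from the table)
sample = "0, NVIDIA L4, 19706, 23034, 62\n1, NVIDIA L4, 16312, 23034, 70\n"
for gpu in parse_gpu_csv(sample):
    print(gpu)
```

Calling `query_gpus()` in a loop during generation would give a utilization trace instead of a single snapshot.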
results: 3:22 at 50% GPU saturation
genarate start: 22:07:46
C:\Program Files\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 22:11:08
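The 3:22 figure is the difference between the printed start (22:07:46) and end (22:11:08) times. A small sketch for computing this from the log strings follows. The function name `elapsed` is an illustrative choice.

```python
from datetime import datetime, timedelta

def elapsed(start, end, fmt="%H:%M:%S"):
    """Elapsed time between two clock strings; handles a run that crosses midnight."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta

print(elapsed("22:07:46", "22:11:08"))  # 0:03:22
```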
2nd run
GPU 0: 60% / GPU 1: 70% utilization
genarate start: 22:21:05
C:\Program Files\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 22:24:33
PS C:\Users\michael> nvidia-smi
Sun Feb 25 22:27:40 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.25 Driver Version: 536.25 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA L4 WDDM | 00000000:00:03.0 Off | 0 |
| N/A 65C P0 38W / 72W | 7661MiB / 23034MiB | 67% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA L4 WDDM | 00000000:00:04.0 Off | 0 |
| N/A 61C P0 34W / 72W | 5114MiB / 23034MiB | 68% Default |
| | | N/A |
genarate start: 22:27:26
C:\Program Files\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 22:31:23
However, running without NVIDIA GRID, as below:
gcloud compute instances create nvidia-rtx-virtual-workstation-window-7-vm-20240225-215824 --project=cuda-old --zone=us-east4-a --machine-type=g2-standard-24 --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default --metadata=^,@^google-monitoring-enable=0,@google-logging-enable=0,@windows-keys=\{\"expireOn\":\"2023-08-12T00:35:23.193242Z\",\"userName\":\"michael\",\"email\":\"mic...ienlabs.dev\",\"modulus\":\"k7R8sljAONIAZoMUuQ\+/KR7\+BH03q52QYhYT8yDWM4tAcveUC\+xjPhQ/LRhQG1GPY/yIOXp1zWKF7V87v0Ffi1xTUghkctLXXRRuqUjqC3L2JSuB7eHYijDfk5XUkaIoZq\+VMjHRBo7bw2dq3JSs0Czfv/BhNzGPrd0tI/UoBIFt7CZ3oxwqC5b5w0NAL9NdqD1LkEmqN56aMbVd9f9rnmEFlENySRbZXIeq61MT9qnDkfMm6Iq0eMY3g8vBYSplYGxbCxETOIvAU/5uh5gkjupX9A01O9DtJpTHoN98X6QHtED8xgYrwneMbtRvwgdRjNzFnH5mL4j95ZZprrEe3Q==\",\"exponent\":\"AQAB\"\} --maintenance-policy=TERMINATE --provisioning-model=STANDARD --service-account=196717963363-compute@developer.gserviceaccount.com --scopes=https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/devstorage.read_only --accelerator=count=2,type=nvidia-l4-vws --tags=nvidia-rtx-virtual-workstation-window-7-deployment --create-disk=auto-delete=yes,boot=yes,device-name=autogen-vm-tmpl-boot-disk,image=projects/nvidia-vgpu-public/global/images/nv-windows-server-2022-vws-536-25-v202306270722,mode=rw,size=50,type=projects/cuda-old/zones/us-east4-a/diskTypes/pd-balanced --no-shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring --labels=goog-dm=nvidia-rtx-virtual-workstation-window-7,goog-ec-src=vm_add-gcloud --reservation-affinity=any
Single L4
PS C:\wse_github\machine-learning\environments\windows\src\google-gemma> $Env:CUDA_VISIBLE_DEVICES = 0
PS C:\wse_github\machine-learning\environments\windows\src\google-gemma> python gemma-gpu.py
Faster: 0:24 for the Gemma 2B model, which fits in a single L4's 24G
genarate start: 22:51:35
C:\Program Files\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 22:51:59
PS C:\Users\michael> nvidia-smi
Sun Feb 25 22:51:54 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.25 Driver Version: 536.25 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA L4 WDDM | 00000000:00:03.0 Off | 0 |
| N/A 78C P0 72W / 72W | 12725MiB / 23034MiB | 91% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA L4 WDDM | 00000000:00:04.0 Off | 0 |
| N/A 55C P8 12W / 72W | 0MiB / 23034MiB | 0% Default |
| | | N/A |
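Whether a model fits on one card can be estimated roughly from its parameter count and dtype. The sketch below rests on assumptions. The parameter counts are approximate figures (about 2.5e9 for Gemma 2B and 8.5e9 for Gemma 7B) that should be checked against the model cards. The weights are assumed to load as float32 unless a `torch_dtype` is passed. The estimate covers weights only, not the KV cache or activations.

```python
def weight_gib(num_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB (no KV cache or activations)."""
    return num_params * bytes_per_param / 2**30

# Assumed, approximate parameter counts; verify against the model cards.
MODELS = {"gemma-2b": 2.5e9, "gemma-7b": 8.5e9}
DTYPES = {"float32": 4, "bfloat16": 2}

for model, params in MODELS.items():
    for dtype, size in DTYPES.items():
        print(f"{model} {dtype}: {weight_gib(params, size):.1f} GiB")
```

Under these assumptions the 2B weights fit comfortably on one 24G L4 in either dtype, while the 7B weights in float32 would not.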
selecting devices in code
import os
#os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
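One caveat with setting the variable in code is that CUDA reads `CUDA_VISIBLE_DEVICES` when it is first initialized, so it must be set before torch touches the GPU. A minimal sketch follows. The helper `select_gpus` is a hypothetical name, and the torch import is optional so the snippet also runs without it.

```python
import os

def select_gpus(device_ids):
    """Restrict CUDA to the given device indices, e.g. [0] or [0, 1]."""
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

select_gpus([0])  # single GPU; use [0, 1] for both

# Set the variable before torch initializes CUDA; doing it before the
# import is the simplest way to guarantee that ordering.
try:
    import torch
    print("visible GPUs:", torch.cuda.device_count())
except ImportError:
    print("torch not installed; CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```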
RTX-4090 single Gemma 2B
$ python gemma-gpu.py
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:02<00:00, 1.04s/it]
genarate start: 23:33:15
C:\Users\michael\AppData\Roaming\Python\Python311\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 23:33:23
RTX-A4500 single Gemma 2B
michael@13900d MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.34s/it]
genarate start: 23:36:11
C:\opt\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, we need to understand what the beta and r process are. The beta process is a type of nuclear reaction that occurs in stars when a neutron is converted into a proton, releasing a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. The r process is a type of nuclear reaction that occurs in supernovae and neutron stars. It involves the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of elements heavier than iron that are not produced by the beta process. Now, let's consider how gold is made in these processes. In the beta process, gold is produced by the conversion of a neutron into a proton, followed by the emission of a positron and an electron neutrino. This process is responsible for the production of most of the elements heavier than iron in the universe. However, the r process is responsible for the production of gold in particular. In supernovae and neutron stars, gold is produced by the capture of a neutron by a nucleus, followed by the emission of a proton and an electron neutrino. This process is responsible for the production of gold in the universe.
Step 2/2
Therefore, the ratio of gold created during the beta and r process depends on the ratio of the number of neutrons to protons in the star. If the ratio is high, more gold will be produced by the beta process, and if the ratio is low, more gold will be produced by the r process. However, the exact ratio of gold created by these processes is not known, as it depends on the specific conditions of the star and the supernova or neutron star.<eos>
end 23:36:21
No H100 or A100 (80G/40G) is available, but a V100 with 32G is available in Amsterdam
$2.11 hourly
gcloud compute instances create instance-20240227-021806 --project=cuda-old --zone=europe-west4-a --machine-type=n1-standard-8 --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default --maintenance-policy=TERMINATE --provisioning-model=STANDARD --service-account=196717963363-compute@developer.gserviceaccount.com --scopes=https://www.googleapis.com/auth/cloud-platform --accelerator=count=1,type=nvidia-tesla-v100 --tags=http-server,https-server --create-disk=auto-delete=yes,boot=yes,device-name=instance-20240227-021806,image=projects/ml-images/global/images/c0-deeplearning-common-cu113-v20230925-debian-10,mode=rw,size=200,type=projects/cuda-old/zones/europe-west4-a/diskTypes/pd-balanced --no-shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring --labels=goog-ec-src=vm_add-gcloud --reservation-affinity=any
======================================
Welcome to the Google Deep Learning VM
======================================
Version: common-cu113.m112
Based on: Debian GNU/Linux 10 (buster) (GNU/Linux 4.19.0-25-cloud-amd64 x86_64\n)
Resources:
* Google Deep Learning Platform StackOverflow: https://stackoverflow.com/questions/tagged/google-dl-platform
* Google Cloud Documentation: https://cloud.google.com/deep-learning-vm
* Google Group: https://groups.google.com/forum/#!forum/google-dl-platform
To reinstall Nvidia driver (if needed) run:
sudo /opt/deeplearning/install-driver.sh
Linux instance-20240227-v100cuda32 4.19.0-25-cloud-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
This VM requires Nvidia drivers to function correctly. Installation takes ~1 minute.
Would you like to install the Nvidia driver? [y/n] y
Waiting for security updates to finish...-Installing Nvidia driver.
+ main
+ wait_apt_locks_released
+ echo 'wait apt locks released'
wait apt locks released
+ sudo fuser /var/lib/dpkg/lock /var/lib/apt/lists/lock /var/cache/apt/archives/lock
+ sudo fuser /var/lib/dpkg/lock-frontend
+ install_linux_headers
++ uname -r
+ echo 'install linux headers: linux-headers-4.19.0-25-cloud-amd64'
install linux headers: linux-headers-4.19.0-25-cloud-amd64
++ uname -r
+ sudo apt-get -o DPkg::Lock::Timeout=120 install -y linux-headers-4.19.0-25-cloud-amd64
Reading package lists... Done
Building dependency tree
Reading state information... Done
linux-headers-4.19.0-25-cloud-amd64 is already the newest version (4.19.289-2).
0 upgraded, 0 newly installed, 0 to remove and 11 not upgraded.
+ source /opt/deeplearning/driver-version.sh
++ export DRIVER_VERSION=510.47.03
++ DRIVER_VERSION=510.47.03
++ export DRIVER_UBUNTU_DEB=nvidia-driver-local-repo-ubuntu1804-510.47.03_1.0-1_amd64.deb
++ DRIVER_UBUNTU_DEB=nvidia-driver-local-repo-ubuntu1804-510.47.03_1.0-1_amd64.deb
++ export DRIVER_UBUNTU_CUDA_VERSION=11.3.1
++ DRIVER_UBUNTU_CUDA_VERSION=11.3.1
++ export DRIVER_UBUNTU_PKG=nvidia-driver-510
++ DRIVER_UBUNTU_PKG=nvidia-driver-510
+ export DRIVER_GCS_PATH
++ get_attribute_value nvidia-driver-gcs-path
++ get_metadata_value instance/attributes/nvidia-driver-gcs-path
++ curl --retry 5 -s -f -H 'Metadata-Flavor: Google' http://metadata/computeMetadata/v1/instance/attributes/nvidia-driver-gcs-path
+ DRIVER_GCS_PATH=
+ install_nvidia_linux_drivers
+ echo 'DRIVER_VERSION: 510.47.03'
DRIVER_VERSION: 510.47.03
+ local driver_installer_file_name=driver_installer.run
+ local nvidia_driver_file_name=NVIDIA-Linux-x86_64-510.47.03.run
+ custom_driver=false
+ local driver_gcs_file_path
+ [[ -z '' ]]
+ DRIVER_GCS_PATH=gs://nvidia-drivers-us-public/tesla/510.47.03
+ driver_gcs_file_path=gs://nvidia-drivers-us-public/tesla/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run
+ echo 'Downloading driver from GCS location and install: gs://nvidia-drivers-us-public/tesla/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run'
Downloading driver from GCS location and install: gs://nvidia-drivers-us-public/tesla/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run
+ set +e
+ gsutil -q cp gs://nvidia-drivers-us-public/tesla/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run driver_installer.run
+ set -e
+ [[ ! -f driver_installer.run ]]
+ [[ ! -f driver_installer.run ]]
+ local open_kernel_module_arg=-m=kernel-open
+ IFS=.
+ read -r major minor patch
++ get_metadata_value instance/machine-type
++ curl --retry 5 -s -f -H 'Metadata-Flavor: Google' http://metadata/computeMetadata/v1/instance/machine-type
+ local -r machine_type_full=projects/196717963363/machineTypes/n1-standard-8
+ local machine_type=n1-standard-8
+ [[ 510 -lt 525 ]]
+ open_kernel_module_arg=
+ chmod +x driver_installer.run
+ sudo ./driver_installer.run --dkms -a -s --no-drm --install-libglvnd ''
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 510.47.03...
WARNING: The nvidia-drm module will not be installed. As a result, DRM-KMS will not function with this
installation of the NVIDIA driver.
WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path
'/usr/lib64/xorg/modules'; these paths were not queryable from the system. If X fails to find the
NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package
for your distribution and reinstall the driver.
+ rm -rf driver_installer.run
+ exit 0
Nvidia driver installed.
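The two decisions visible in the install trace above (the default GCS location when no custom path is set, and dropping the open kernel module flag for drivers older than 525) can be sketched in Python. This is only a reading of the trace, not the actual install script: the function names are hypothetical, and the machine-type lookup the script also performs is not modelled because its effect is not visible in the trace.

```python
from typing import Optional

# Default bucket as it appears in the trace when nvidia-driver-gcs-path is empty.
PUBLIC_DRIVER_BUCKET = "gs://nvidia-drivers-us-public/tesla"


def driver_gcs_file_path(version: str, gcs_path: Optional[str] = None) -> str:
    """Return the installer location, falling back to the public bucket."""
    base = gcs_path or f"{PUBLIC_DRIVER_BUCKET}/{version}"
    return f"{base}/NVIDIA-Linux-x86_64-{version}.run"


def open_kernel_module_arg(version: str) -> str:
    """Return '-m=kernel-open' only when the driver major version is 525 or newer."""
    major = int(version.split(".")[0])
    return "" if major < 525 else "-m=kernel-open"


if __name__ == "__main__":
    print(driver_gcs_file_path("510.47.03"))
    print(repr(open_kernel_module_arg("510.47.03")))
```

For 510.47.03 the flag comes back empty, which matches the empty trailing argument passed to driver_installer.run in the trace.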
The V100 has only 16 GB of memory, less than the L4's 24 GB and not the 32 GB I expected, but it is a 300 W actively cooled card rather than a 75 W passively cooled one.
(base) michael@instance-20240227-v100cuda32:~$ nvidia-smi
Tue Feb 27 02:34:23 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... Off | 00000000:00:04.0 Off | 0 |
| N/A 32C P0 38W / 300W | 0MiB / 16384MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
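To compare cards without reading the table by eye, the same numbers can be pulled from nvidia-smi in CSV form. A minimal sketch, assuming nvidia-smi is on the PATH. The helper names are hypothetical and the sample line in the usage part is made up for illustration, not captured from this VM.

```python
import subprocess
from typing import Dict, List

# Query fields supported by nvidia-smi; nounits drops the "MiB" and "W" suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,power.limit",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text: str) -> List[Dict[str, object]]:
    """Parse 'name, memory MiB, power W' lines into dictionaries."""
    gpus = []
    for line in text.strip().splitlines():
        name, memory_mib, power_w = [field.strip() for field in line.split(",")]
        gpus.append({
            "name": name,
            "memory_mib": int(memory_mib),
            "power_limit_w": float(power_w),
        })
    return gpus


def query_gpus() -> List[Dict[str, object]]:
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, check=True, capture_output=True, text=True).stdout
    return parse_gpu_csv(out)


if __name__ == "__main__":
    # Hypothetical sample line so the parser can be tried without a GPU.
    print(parse_gpu_csv("Example GPU, 16384, 300.00"))
```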
(base) michael@instance-20240227-v100cuda32:~$ python --version
Python 3.7.12
Install the libraries and run Google Gemma 2B from the Hugging Face repo.
First clone this repo: https://github.com/ObrienlabsDev/machine-learning.git
nvidia-smi
pip install -U transformers
pip install -U torch
pip install accelerate
git clone https://github.com/ObrienlabsDev/machine-learning.git
cd machine-learning/
cd environments/windows/src/google-gemma/
Fix the Hugging Face token first (use your own),
or add the following to the script (not needed on the Windows L4 image, only on the Linux V100 image)
from huggingface_hub import login
login()
and log in on the fly
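An alternative to the interactive prompt is to pass the token to login() from an environment variable. This is a sketch, not what gemma-gpu.py does: the variable name HF_TOKEN and both helper names are assumptions, and huggingface_hub is imported lazily so the token lookup can be tried on its own.

```python
import os
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str], var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in the given variable, or None if it is unset or blank."""
    token = env.get(var, "").strip()
    return token or None


def login_noninteractive() -> None:
    """Log in with the environment token, falling back to the interactive prompt."""
    from huggingface_hub import login  # lazy import: only needed when logging in

    token = resolve_token(os.environ)
    if token is None:
        login()  # same prompt as calling login() with no arguments
    else:
        login(token=token)


if __name__ == "__main__":
    print("token found" if resolve_token(os.environ) else "no token in environment")
```

Keeping the token out of the script also avoids committing it to the repo by accident.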
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ python gemma-gpu.py
_| _| _| _| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _|_|_|_| _|_| _|_|_| _|_|_|_|
_| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_|_|_|_| _| _| _| _|_| _| _|_| _| _| _| _| _| _|_| _|_|_| _|_|_|_| _| _|_|_|
_| _| _| _| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_| _| _|_| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _| _| _| _|_|_| _|_|_|_|
A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
Setting a new token will erase the existing one.
To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token:
Add token as git credential? (Y/n) n
Token is valid (permission: read).
Your token has been saved to /home/michael/.cache/huggingface/token
Login successful
Traceback (most recent call last):
File "gemma-gpu.py", line 18, in <module>
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")#, token=access_token)
File "/opt/conda/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 689, in from_pretrained
f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported."
ValueError: Tokenizer class GemmaTokenizer does not exist or is not currently imported.
Log in from the interpreter instead:
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21)
>>> from huggingface_hub import login
>>> login()
_| _| _| _| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _|_|_|_| _|_| _|_|_| _|_|_|_|
_| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_|_|_|_| _| _| _| _|_| _| _|_| _| _| _| _| _| _|_| _|_|_| _|_|_|_| _| _|_|_|
_| _| _| _| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_| _| _|_| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _| _| _| _|_|_| _|_|_|_|
A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
Setting a new token will erase the existing one.
To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token:
Add token as git credential? (Y/n) n
Token is valid (permission: read).
Your token has been saved to /home/michael/.cache/huggingface/token
Login successful
>>>
Login succeeds, but rerunning the script with token=access_token gives the same error:
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ python gemma-gpu.py
Traceback (most recent call last):
File "gemma-gpu.py", line 18, in <module>
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", token=access_token)
File "/opt/conda/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 689, in from_pretrained
f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported."
ValueError: Tokenizer class GemmaTokenizer does not exist or is not currently imported.
pip install -U transformers
does not fix it.
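The GemmaTokenizer error points at the installed transformers release rather than at the token. To the best of my knowledge Gemma support first shipped in transformers 4.38.0, and the pip output further down this thread shows every release from 4.31.0 on requires Python >= 3.8, so an upgrade on Python 3.7 cannot reach it. A small check, with hypothetical helper names and the 4.38.0 threshold as the stated assumption:

```python
from typing import Tuple

# Assumption: first transformers release that includes the Gemma classes.
GEMMA_MIN_TRANSFORMERS = (4, 38, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.38.1' or '4.39.0.dev0' into a comparable tuple of three integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def supports_gemma(transformers_version: str) -> bool:
    """Return True when the given transformers version is at least the Gemma minimum."""
    return parse_version(transformers_version) >= GEMMA_MIN_TRANSFORMERS


if __name__ == "__main__":
    # In a real session, pass transformers.__version__ instead of a literal.
    print(supports_gemma("4.30.2"))
```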
switching to use_auth_token=True
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", use_auth_token=True) #token=access_token)
# GPU
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", device_map="auto", use_auth_token=True) #token=access_token)
Trying a suggestion posted 16h ago, installing transformers from source:
!pip -q install git+https://github.com/huggingface/transformers.git
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ !pip -q install git+https://github.com/huggingface/transformers.git
pip install -U transformers -q install git+https://github.com/huggingface/transformers.git
ERROR: Ignored the following versions that require a different python version: 0.17.0 Requires-Python >=3.8.0; 0.17.0rc0 Requires-Python >=3.8.0; 0.17.1 Requires-Python >=3.8.0; 0.17.2 Requires-Python >=3.8.0; 0.17.3 Requires-Python >=3.8.0; 0.18.0 Requires-Python >=3.8.0; 0.18.0rc0 Requires-Python >=3.8.0; 0.19.0 Requires-Python >=3.8.0; 0.19.0rc0 Requires-Python >=3.8.0; 0.19.1 Requires-Python >=3.8.0; 0.19.2 Requires-Python >=3.8.0; 0.19.3 Requires-Python >=3.8.0; 0.19.4 Requires-Python >=3.8.0; 0.20.0 Requires-Python >=3.8.0; 0.20.0rc0 Requires-Python >=3.8.0; 0.20.0rc1 Requires-Python >=3.8.0; 0.20.1 Requires-Python >=3.8.0; 0.20.2 Requires-Python >=3.8.0; 0.20.3 Requires-Python >=3.8.0; 4.31.0 Requires-Python >=3.8.0; 4.32.0 Requires-Python >=3.8.0; 4.32.1 Requires-Python >=3.8.0; 4.33.0 Requires-Python >=3.8.0; 4.33.1 Requires-Python >=3.8.0; 4.33.2 Requires-Python >=3.8.0; 4.33.3 Requires-Python >=3.8.0; 4.34.0 Requires-Python >=3.8.0; 4.34.1 Requires-Python >=3.8.0; 4.35.0 Requires-Python >=3.8.0; 4.35.1 Requires-Python >=3.8.0; 4.35.2 Requires-Python >=3.8.0; 4.36.0 Requires-Python >=3.8.0; 4.36.1 Requires-Python >=3.8.0; 4.36.2 Requires-Python >=3.8.0; 4.37.0 Requires-Python >=3.8.0; 4.37.1 Requires-Python >=3.8.0; 4.37.2 Requires-Python >=3.8.0; 4.38.0 Requires-Python >=3.8.0; 4.38.1 Requires-Python >=3.8.0
ERROR: Could not find a version that satisfies the requirement huggingface-hub<1.0,>=0.19.3 (from transformers) (from versions: 0.0.1, 0.0.2, 0.0.3rc1, 0.0.3rc2, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.14, 0.0.15, 0.0.16, 0.0.17, 0.0.18, 0.0.19, 0.1.0, 0.1.1, 0.1.2, 0.2.0, 0.2.1, 0.4.0, 0.5.0, 0.5.1, 0.6.0rc0, 0.6.0, 0.7.0rc0, 0.7.0, 0.8.0rc0, 0.8.0rc1, 0.8.0rc2, 0.8.0rc3, 0.8.0rc4, 0.8.0, 0.8.1, 0.9.0.dev0, 0.9.0rc0, 0.9.0rc2, 0.9.0rc3, 0.9.0, 0.9.1, 0.10.0rc0, 0.10.0rc1, 0.10.0rc3, 0.10.0, 0.10.1, 0.11.0rc0, 0.11.0rc1, 0.11.0, 0.11.1, 0.12.0rc0, 0.12.0, 0.12.1, 0.13.0rc0, 0.13.0rc1, 0.13.0, 0.13.1, 0.13.2, 0.13.3, 0.13.4, 0.14.0rc0, 0.14.0rc1, 0.14.0, 0.14.1, 0.15.0rc0, 0.15.0, 0.15.1, 0.16.0rc0, 0.16.1, 0.16.2, 0.16.3, 0.16.4)
ERROR: No matching distribution found for huggingface-hub<1.0,>=0.19.3
The issue is the Python version: the releases listed above require Python >= 3.8 (I was aiming for 3.11) and this image runs 3.7.
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ python --version
Python 3.7.12
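A guard at the top of the script would surface this before any pip or import error. A minimal sketch: the 3.8 minimum is taken from the Requires-Python lines in the pip output above, and the function name is hypothetical.

```python
import sys
from typing import Optional, Sequence, Tuple

# Minimum taken from the "Requires-Python >=3.8.0" lines in the pip output.
MIN_PYTHON = (3, 8)


def python_ok(version: Optional[Sequence[int]] = None,
              minimum: Tuple[int, int] = MIN_PYTHON) -> bool:
    """Return True when the given (or running) interpreter meets the minimum version."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        found = ".".join(map(str, sys.version_info[:3]))
        raise SystemExit(f"Python >= 3.8 required, found {found}")
    print("Python version is new enough")
```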
installing python 3.12
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ sudo apt install python 3.12
(base) michael@instance-20240227-v100cuda32:~/machine-learning/environments/windows/src/google-gemma$ sudo apt-get install --only-upgrade python3
Reading package lists... Done
Building dependency tree
Reading state information... Done
python3 is already the newest version (3.7.3-1).
0 upgraded, 0 newly installed, 0 to remove and 12 not upgraded.
Switching to a newer image: Deep Learning VM with CUDA 12.1 M116
(Debian 11, Python 3.10, with CUDA 12.1 preinstalled).
gcloud compute instances create instance-2024022-v100b16 --project=cuda-old --zone=europe-west4-a --machine-type=n1-standard-8 --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default --maintenance-policy=TERMINATE --provisioning-model=STANDARD --service-account=196717963363-compute@developer.gserviceaccount.com --scopes=https://www.googleapis.com/auth/cloud-platform --accelerator=count=1,type=nvidia-tesla-v100 --tags=http-server,https-server --create-disk=auto-delete=yes,boot=yes,device-name=instance-2024022-v100b16,image=projects/ml-images/global/images/c0-deeplearning-common-cu121-v20240128-debian-11-py310,mode=rw,size=200,type=projects/cuda-old/zones/europe-west4-a/diskTypes/pd-balanced --no-shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring --labels=goog-ec-src=vm_add-gcloud --reservation-affinity=any
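The command above is long enough that scripting it makes it easier to swap the image or zone. The sketch below assembles a reduced argument list using only flags that appear in that command. The function name is hypothetical, and the reduced flag set has not been run against gcloud, so treat it as a starting point.

```python
import shlex
from typing import List


def build_create_command(name: str, project: str, zone: str, image: str,
                         machine_type: str = "n1-standard-8",
                         accelerator: str = "nvidia-tesla-v100",
                         disk_gb: int = 200) -> List[str]:
    """Assemble a 'gcloud compute instances create' argument list for one GPU VM."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--project={project}",
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=count=1,type={accelerator}",
        "--maintenance-policy=TERMINATE",
        f"--create-disk=auto-delete=yes,boot=yes,image={image},size={disk_gb}",
    ]


cmd = build_create_command(
    name="instance-2024022-v100b16",
    project="cuda-old",
    zone="europe-west4-a",
    image="projects/ml-images/global/images/"
          "c0-deeplearning-common-cu121-v20240128-debian-11-py310",
)
# Print a shell-safe version; pass cmd to subprocess.run to actually create the VM.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The maintenance policy stays fixed at TERMINATE because instances with attached GPUs cannot live migrate.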
We will see later whether 3.10 is sufficient instead of 3.11.
(base) michael@instance-2024022-v100b16:~$ python3 --version
Python 3.10.13
Gemma on Vertex AI Model Garden https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
2024/03/26 22:30:47 http2: server: error reading preface from client //./pipe/docker_engine: file has already been closed
#0 building with "default" instance using docker driver
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 485B done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/tensorflow/tensorflow:latest-gpu
#3 DONE 0.5s
#4 [1/3] FROM docker.io/tensorflow/tensorflow:latest-gpu@sha256:4ab9ffddd6ffacc9251ac6439f431eb38d66200d3f52397b5d77f9bc3298c4e9
#4 DONE 0.0s
#5 [internal] load build context
#5 transferring context: 57B done
#5 DONE 0.0s
#6 [2/3] WORKDIR /src
#6 CACHED
#7 [3/3] COPY /src/tflow.py .
#7 CACHED
#8 exporting to image
#8 exporting layers done
#8 writing image sha256:8ed644da2ebba91f78a1a769325adc43c153536365b2aa857da1a7628136faeb done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.0s
What's Next?
1. Sign in to your Docker account → docker login
2. View a summary of image vulnerabilities and recommendations → docker scout quickview
2024-03-27 02:30:48.886114: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-27 02:30:48.920618: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-27 02:30:49.840948: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:49.844225: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:49.844270: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:49.848333: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:49.848358: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:49.848366: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:50.770778: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:50.770819: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:50.770825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2019] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-03-27 02:30:50.770944: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-27 02:30:50.771087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 45757 MB memory: -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step
Epoch 1/50
2024-03-27 02:31:14.860698: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:465] Loaded cuDNN version 8906
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 403ms/step - accuracy: 0.0193 - loss: 5.92562024-03-27 02:31:28.659203: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:28.659288: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 33s 628ms/step - accuracy: 0.0205 - loss: 5.8367
Epoch 2/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 403ms/step - accuracy: 0.0669 - loss: 4.20572024-03-27 02:31:36.654512: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:36.654549: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 382ms/step - accuracy: 0.0686 - loss: 4.1907
Epoch 3/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 405ms/step - accuracy: 0.1232 - loss: 3.81262024-03-27 02:31:41.708714: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:41.708885: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 382ms/step - accuracy: 0.1249 - loss: 3.8003
Epoch 4/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 413ms/step - accuracy: 0.1859 - loss: 3.43082024-03-27 02:31:46.860052: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:46.860109: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 393ms/step - accuracy: 0.1870 - loss: 3.4237
Epoch 5/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 407ms/step - accuracy: 0.2577 - loss: 3.06172024-03-27 02:31:51.956526: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:51.956592: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.2583 - loss: 3.0574
Epoch 6/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.3328 - loss: 2.66182024-03-27 02:31:57.061256: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:31:57.061301: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 386ms/step - accuracy: 0.3333 - loss: 2.6603
Epoch 7/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 405ms/step - accuracy: 0.4023 - loss: 2.32382024-03-27 02:32:02.117569: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:02.117644: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.4027 - loss: 2.3221
Epoch 8/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 405ms/step - accuracy: 0.4754 - loss: 2.00592024-03-27 02:32:07.214123: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:07.214181: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.4749 - loss: 2.0077
Epoch 9/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.5448 - loss: 1.71792024-03-27 02:32:12.314759: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:12.314809: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.5432 - loss: 1.7226
Epoch 10/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 406ms/step - accuracy: 0.5986 - loss: 1.50712024-03-27 02:32:17.391075: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:17.391118: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 384ms/step - accuracy: 0.5971 - loss: 1.5099
Epoch 11/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.6382 - loss: 1.30302024-03-27 02:32:22.558514: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:22.558562: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.6373 - loss: 1.3070
Epoch 12/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 407ms/step - accuracy: 0.6792 - loss: 1.16252024-03-27 02:32:27.620701: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:27.620755: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.6771 - loss: 1.1689
Epoch 13/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.7221 - loss: 0.99232024-03-27 02:32:32.731691: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:32.731732: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.7202 - loss: 0.9970
Epoch 14/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.7718 - loss: 0.80202024-03-27 02:32:37.839133: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:37.839181: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.7716 - loss: 0.7997
Epoch 15/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 409ms/step - accuracy: 0.8208 - loss: 0.62872024-03-27 02:32:42.915554: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:42.915601: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.8197 - loss: 0.6313
Epoch 16/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.8508 - loss: 0.51042024-03-27 02:32:48.022487: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:48.022564: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 386ms/step - accuracy: 0.8493 - loss: 0.5163
Epoch 17/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.8664 - loss: 0.45142024-03-27 02:32:53.150208: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:53.150257: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.8657 - loss: 0.4540
Epoch 18/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.8464 - loss: 0.59282024-03-27 02:32:58.335544: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:32:58.335582: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.8441 - loss: 0.5975
Epoch 19/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.8598 - loss: 0.48782024-03-27 02:33:03.497457: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:03.497506: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.8591 - loss: 0.4901
Epoch 20/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.9000 - loss: 0.34322024-03-27 02:33:08.674424: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:08.674492: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 392ms/step - accuracy: 0.8992 - loss: 0.3495
Epoch 21/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 415ms/step - accuracy: 0.8625 - loss: 0.44792024-03-27 02:33:13.844758: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:13.844808: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.8596 - loss: 0.4574
Epoch 22/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 415ms/step - accuracy: 0.8769 - loss: 0.41382024-03-27 02:33:18.990575: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:18.990633: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.8764 - loss: 0.4148
Epoch 23/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9143 - loss: 0.29442024-03-27 02:33:24.094218: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:24.094496: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 390ms/step - accuracy: 0.9138 - loss: 0.2950
Epoch 24/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 411ms/step - accuracy: 0.9403 - loss: 0.21402024-03-27 02:33:34.650706: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:34.650759: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 11s 393ms/step - accuracy: 0.9399 - loss: 0.2150
Epoch 25/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 409ms/step - accuracy: 0.9603 - loss: 0.14942024-03-27 02:33:39.821568: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:39.821618: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.9599 - loss: 0.1508
Epoch 26/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.9683 - loss: 0.12572024-03-27 02:33:44.905229: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:44.905540: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.9678 - loss: 0.1270
Epoch 27/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 411ms/step - accuracy: 0.9456 - loss: 0.19372024-03-27 02:33:50.058238: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:50.058289: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.9443 - loss: 0.1976
Epoch 28/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 413ms/step - accuracy: 0.9503 - loss: 0.16942024-03-27 02:33:55.204099: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:33:55.204145: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 390ms/step - accuracy: 0.9498 - loss: 0.1709
Epoch 29/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.9561 - loss: 0.14962024-03-27 02:34:00.296730: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:00.296778: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.9555 - loss: 0.1513
Epoch 30/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 407ms/step - accuracy: 0.9656 - loss: 0.1177
2024-03-27 02:34:05.420959: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:05.421080: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.9649 - loss: 0.1194
Epoch 31/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.9681 - loss: 0.1082
2024-03-27 02:34:10.570976: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:10.571048: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.9677 - loss: 0.1115
Epoch 32/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9665 - loss: 0.1161
2024-03-27 02:34:15.688882: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:15.688937: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.9658 - loss: 0.1191
Epoch 33/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9611 - loss: 0.1366
2024-03-27 02:34:20.806693: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:20.806731: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.9607 - loss: 0.1399
Epoch 34/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.9641 - loss: 0.1368
2024-03-27 02:34:25.956910: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:25.956985: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 390ms/step - accuracy: 0.9637 - loss: 0.1377
Epoch 35/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9665 - loss: 0.1329
2024-03-27 02:34:31.097990: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:31.098036: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.9657 - loss: 0.1384
Epoch 36/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.9491 - loss: 0.1655
2024-03-27 02:34:36.234712: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:36.234762: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.9484 - loss: 0.1721
Epoch 37/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 413ms/step - accuracy: 0.9488 - loss: 0.1926
2024-03-27 02:34:41.419997: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:41.420053: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - accuracy: 0.9479 - loss: 0.1954
Epoch 38/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.9291 - loss: 0.2351
2024-03-27 02:34:46.521615: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:46.521659: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 387ms/step - accuracy: 0.9258 - loss: 0.2473
Epoch 39/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.8914 - loss: 0.3537
2024-03-27 02:34:51.658518: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:51.658567: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 395ms/step - accuracy: 0.8907 - loss: 0.3609
Epoch 40/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 414ms/step - accuracy: 0.9146 - loss: 0.2736
2024-03-27 02:34:56.856334: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:34:56.856662: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 390ms/step - accuracy: 0.9145 - loss: 0.2747
Epoch 41/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 410ms/step - accuracy: 0.9365 - loss: 0.2033
2024-03-27 02:35:01.959477: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:01.959830: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.9363 - loss: 0.2042
Epoch 42/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 411ms/step - accuracy: 0.9585 - loss: 0.1450
2024-03-27 02:35:07.080055: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:07.080126: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.9583 - loss: 0.1474
Epoch 43/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 413ms/step - accuracy: 0.9714 - loss: 0.1029
2024-03-27 02:35:12.254993: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:12.255123: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 390ms/step - accuracy: 0.9711 - loss: 0.1036
Epoch 44/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9811 - loss: 0.0960
2024-03-27 02:35:17.437499: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:17.437552: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 389ms/step - accuracy: 0.9809 - loss: 0.0956
Epoch 45/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 419ms/step - accuracy: 0.9842 - loss: 0.0580
2024-03-27 02:35:22.730836: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:22.730869: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 397ms/step - accuracy: 0.9840 - loss: 0.0587
Epoch 46/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 415ms/step - accuracy: 0.9890 - loss: 0.0402
2024-03-27 02:35:27.916598: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:27.916671: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 394ms/step - accuracy: 0.9889 - loss: 0.0407
Epoch 47/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 415ms/step - accuracy: 0.9934 - loss: 0.0318
2024-03-27 02:35:33.106803: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:33.106851: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 394ms/step - accuracy: 0.9933 - loss: 0.0351
Epoch 48/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 408ms/step - accuracy: 0.9959 - loss: 0.0211
2024-03-27 02:35:38.235103: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:38.235171: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 385ms/step - accuracy: 0.9958 - loss: 0.0213
Epoch 49/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 412ms/step - accuracy: 0.9967 - loss: 0.0153
2024-03-27 02:35:43.396051: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:43.396157: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.9967 - loss: 0.0155
Epoch 50/50
12/13 ━━━━━━━━━━━━━━━━━━━━ 0s 415ms/step - accuracy: 0.9960 - loss: 0.0399
2024-03-27 02:35:48.538435: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
2024-03-27 02:35:48.538483: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 392ms/step - accuracy: 0.9958 - loss: 0.0388
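The per-epoch numbers are hard to follow between the repeated `OUT_OF_RANGE` warnings. A minimal sketch for pulling the final accuracy and loss of each epoch out of a pasted log like the one above, assuming the Keras progress format shown here; `parse_epochs` and `SAMPLE` are illustrative names, not part of any script in this thread.

```python
import re

# "Epoch 49/50" header lines.
EPOCH = re.compile(r"^Epoch (\d+)/(\d+)$")
# Progress lines such as
# "13/13 ━━━ 5s 388ms/step - accuracy: 0.9967 - loss: 0.0155".
SUMMARY = re.compile(
    r"^(\d+)/(\d+)\s.*accuracy:\s*([0-9.]+)\s*-\s*loss:\s*([0-9.]+)\s*$"
)


def parse_epochs(log_text):
    """Return a list of (epoch, accuracy, loss) for completed epochs."""
    results = []
    epoch = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = EPOCH.match(line)
        if header:
            epoch = int(header.group(1))
            continue
        summary = SUMMARY.match(line)
        # Keep only the last step of an epoch (step == total steps);
        # warning lines and partial progress lines are ignored.
        if summary and summary.group(1) == summary.group(2):
            results.append(
                (epoch, float(summary.group(3)), float(summary.group(4)))
            )
    return results


# Two epochs copied from the log above.
SAMPLE = """Epoch 49/50
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 388ms/step - accuracy: 0.9967 - loss: 0.0155
Epoch 50/50
[[RemoteCall]]
13/13 ━━━━━━━━━━━━━━━━━━━━ 5s 392ms/step - accuracy: 0.9958 - loss: 0.0388
"""

if __name__ == "__main__":
    for epoch, accuracy, loss in parse_epochs(SAMPLE):
        print(f"epoch {epoch}: accuracy={accuracy:.4f} loss={loss:.4f}")
```

Run over the full log, this makes the dip around epochs 38 to 40 and the recovery afterwards easier to see.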
thermal throttling
idle
full power
capacitor squeak for about 1 sec at epoch 49/50 (batch size 2048)
Gemma 7B on an RTX A6000: 32G of 48G VRAM used
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
tokenizer_config.json: 100%|##########| 1.11k/1.11k [00:00<?, ?B/s]
C:\optpython312\Lib\site-packages\huggingface_hub\file_download.py:149: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\michael\.cache\huggingface\hub\models--google--gemma-7b. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
tokenizer.model: 100%|##########| 4.24M/4.24M [00:00<00:00, 29.2MB/s]
tokenizer.json: 100%|##########| 17.5M/17.5M [00:00<00:00, 87.2MB/s]
special_tokens_map.json: 100%|##########| 555/555 [00:00<?, ?B/s]
config.json: 100%|##########| 629/629 [00:00<?, ?B/s]
model.safetensors.index.json: 100%|##########| 20.9k/20.9k [00:00<00:00, 42.0MB/s]
model-00001-of-00004.safetensors: 100%|##########| 5.00G/5.00G [00:54<00:00, 91.2MB/s]
model-00002-of-00004.safetensors: 100%|##########| 4.98G/4.98G [00:55<00:00, 90.4MB/s]
model-00003-of-00004.safetensors: 100%|##########| 4.98G/4.98G [00:54<00:00, 90.8MB/s]
model-00004-of-00004.safetensors: 100%|##########| 2.11G/2.11G [00:23<00:00, 89.8MB/s]
Downloading shards: 100%|##########| 4/4 [03:08<00:00, 47.22s/it]
Loading checkpoint shards: 100%|##########| 4/4 [00:10<00:00, 2.57s/it]
generation_config.json: 100%|##########| 137/137 [00:00<00:00, 275kB/s]
C:\optpython312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 22:53:09
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 22:53:24
36G vram
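A rough sanity check on the VRAM figure, as a sketch under two assumptions the log does not confirm: that the downloaded checkpoint stores 16-bit weights (2 bytes per parameter), and that the model ends up in memory as float32 (4 bytes per parameter) because no `torch_dtype` is passed to `from_pretrained`. The shard sizes are the ones printed by the download progress bars above; everything else is illustrative.

```python
# Shard sizes in GB, as printed by the download progress bars.
SHARD_GB = [5.00, 4.98, 4.98, 2.11]

# Assumption: the checkpoint stores 2 bytes per parameter (16-bit weights).
STORED_BYTES_PER_PARAM = 2
# Assumption: the weights are held in memory as float32 (4 bytes each).
LOADED_BYTES_PER_PARAM = 4


def estimate_loaded_gb(shard_gb, stored_bytes, loaded_bytes):
    """Return (checkpoint GB, parameters in billions, loaded weights GB)."""
    total_gb = sum(shard_gb)
    params_billion = total_gb / stored_bytes
    loaded_gb = params_billion * loaded_bytes
    return total_gb, params_billion, loaded_gb


total_gb, params_billion, loaded_gb = estimate_loaded_gb(
    SHARD_GB, STORED_BYTES_PER_PARAM, LOADED_BYTES_PER_PARAM
)
# nvidia-smi reports MiB, so also convert decimal GB to binary GiB.
loaded_gib = loaded_gb * 1e9 / 2**30

if __name__ == "__main__":
    print(f"checkpoint on disk : {total_gb:.2f} GB")
    print(f"parameters (approx): {params_billion:.2f} billion")
    print(f"weights as float32 : {loaded_gb:.2f} GB ({loaded_gib:.1f} GiB)")
```

Under those assumptions the weights alone land in the low-to-mid 30s of gigabytes, which is in the same range as the 32G to 36G observed here; activations and the KV cache come on top. If the assumptions hold, loading in a 16-bit dtype would roughly halve the weight footprint.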
michael@14900c MINGW64 /c/wse_github/ObrienlabsDev/machine-learning/environments/windows/src/google-gemma (main)
$ nvidia-smi
Tue Mar 26 23:00:04 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.99 Driver Version: 537.99 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA RTX A6000 WDDM | 00000000:01:00.0 Off | Off |
| 30% 42C P8 6W / 300W | 0MiB / 49140MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
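The table above shows 0MiB in use because no process is running at that moment. To capture memory use while the model is loaded, `nvidia-smi` can be polled in CSV form from another shell. A minimal sketch, assuming `nvidia-smi` is on the PATH; `parse_gpu_rows` and `query_gpus` are hypothetical helper names.

```python
import shutil
import subprocess

# CSV query for name and memory counters (values in MiB with nounits).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(csv_text):
    """Parse 'name, used, total' lines into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma in the GPU name cannot break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        try:
            rows.append(
                {"name": name, "used_mib": int(used), "total_mib": int(total)}
            )
        except ValueError:
            # Skip rows where the driver reports a non-numeric value.
            continue
    return rows


def query_gpus():
    """Return parsed GPU rows, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    proc = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        return []
    return parse_gpu_rows(proc.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(f"{gpu['name']}: {gpu['used_mib']} / {gpu['total_mib']} MiB")
```

Running this in a loop while `gemma-gpu.py` generates would give a peak figure to compare against the numbers quoted in this thread.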
michael@14900c MINGW64 ~
$ cd /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma/
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python --version
Python 3.12.3
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ cat gemma-gpu.py
import os
# default dual GPU - either PCIe bus or NVIDIA bus - slowdowns
#os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
# specific GPU - model must fit entirely in memory: RTX-3500 Ada = 12G, A4000 = 16G, A4500 = 20G, A6000 = 48G, 4000 Ada = 20G, 5000 Ada = 32G, 6000 Ada = 48G
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
from transformers import AutoTokenizer, AutoModelForCausalLM
from datetime import datetime
access_token='hf_cfTP...QqH'
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b", token=access_token)
# GPU
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", token=access_token)
# CPU
#model = AutoModelForCausalLM.from_pretrained("google/gemma-2b",token=access_token)
input_text = "how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process."
time_start = datetime.now().strftime("%H:%M:%S")
print("genarate start: ", datetime.now().strftime("%H:%M:%S"))
# GPU
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
# CPU
#input_ids = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**input_ids,
max_new_tokens=10000)
print(tokenizer.decode(outputs[0]))
print("end", datetime.now().strftime("%H:%M:%S"))
time_end = datetime.now().strftime("%H:%M:%S")
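The script stores `time_start` and `time_end` as formatted strings and never subtracts them, so the duration has to be read off the two printed timestamps. A minimal sketch of two ways to get a number instead, using only the standard library; `elapsed_seconds` and `timed` are illustrative helpers, not part of the script above.

```python
import time
from datetime import datetime


def elapsed_seconds(start_hms, end_hms):
    """Seconds between two 'HH:MM:SS' strings, such as the printed timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end_hms, fmt) - datetime.strptime(start_hms, fmt)
    # Wrap around one day in case the run crossed midnight.
    return int(delta.total_seconds()) % 86400


def timed(fn, *args, **kwargs):
    """Call fn and return (result, wall-clock seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Timestamps printed by the earlier run in this thread.
    print("elapsed:", elapsed_seconds("22:53:09", "22:53:24"), "s")
    total, seconds = timed(sum, range(1_000_000))
    print(f"sum took {seconds:.4f} s")
```

In the script, the generate call could be wrapped the same way, for example `outputs, seconds = timed(model.generate, **input_ids, max_new_tokens=10000)`, and dividing the number of generated tokens by `seconds` would give a tokens-per-second figure for comparing GPUs.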
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Traceback (most recent call last):
File "C:\wse_github\obrienlabsdev\machine-learning\environments\windows\src\google-gemma\gemma-gpu.py", line 7, in <module>
from transformers import AutoTokenizer, AutoModelForCausalLM
ModuleNotFoundError: No module named 'transformers'
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ vi gemma-gpu.py
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ pip install -U torch
Collecting torch
Downloading torch-2.2.2-cp312-cp312-win_amd64.whl.metadata (26 kB)
Collecting filelock (from torch)
Downloading filelock-3.13.4-py3-none-any.whl.metadata (2.8 kB)
Collecting typing-extensions>=4.8.0 (from torch)
Downloading typing_extensions-4.11.0-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy (from torch)
Downloading sympy-1.12-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
Downloading networkx-3.3-py3-none-any.whl.metadata (5.1 kB)
Collecting jinja2 (from torch)
Downloading Jinja2-3.1.3-py3-none-any.whl.metadata (3.3 kB)
Collecting fsspec (from torch)
Downloading fsspec-2024.3.1-py3-none-any.whl.metadata (6.8 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl.metadata (3.1 kB)
Collecting mpmath>=0.19 (from sympy->torch)
Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading torch-2.2.2-cp312-cp312-win_amd64.whl (198.5 MB)
--------------------------------------- 198.5/198.5 MB 46.7 MB/s eta 0:00:00
Downloading typing_extensions-4.11.0-py3-none-any.whl (34 kB)
Downloading filelock-3.13.4-py3-none-any.whl (11 kB)
Downloading fsspec-2024.3.1-py3-none-any.whl (171 kB)
---------------------------------------- 172.0/172.0 kB ? eta 0:00:00
Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
---------------------------------------- 133.2/133.2 kB 7.7 MB/s eta 0:00:00
Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
---------------------------------------- 1.7/1.7 MB 105.8 MB/s eta 0:00:00
Downloading sympy-1.12-py3-none-any.whl (5.7 MB)
---------------------------------------- 5.7/5.7 MB 122.0 MB/s eta 0:00:00
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl (17 kB)
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
---------------------------------------- 536.2/536.2 kB ? eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, networkx, MarkupSafe, fsspec, filelock, jinja2, torch
Successfully installed MarkupSafe-2.1.5 filelock-3.13.4 fsspec-2024.3.1 jinja2-3.1.3 mpmath-1.3.0 networkx-3.3 sympy-1.12 torch-2.2.2 typing-extensions-4.11.0
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ pip install -U transformers
Collecting transformers
Downloading transformers-4.40.0-py3-none-any.whl.metadata (137 kB)
-------------------------------------- 137.6/137.6 kB 1.6 MB/s eta 0:00:00
Requirement already satisfied: filelock in c:\opt\python312\lib\site-packages (from transformers) (3.13.4)
Collecting huggingface-hub<1.0,>=0.19.3 (from transformers)
Downloading huggingface_hub-0.22.2-py3-none-any.whl.metadata (12 kB)
Collecting numpy>=1.17 (from transformers)
Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
---------------------------------------- 61.0/61.0 kB 3.2 MB/s eta 0:00:00
Collecting packaging>=20.0 (from transformers)
Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting pyyaml>=5.1 (from transformers)
Downloading PyYAML-6.0.1-cp312-cp312-win_amd64.whl.metadata (2.1 kB)
Collecting regex!=2019.12.17 (from transformers)
Downloading regex-2024.4.16-cp312-cp312-win_amd64.whl.metadata (41 kB)
---------------------------------------- 42.0/42.0 kB 2.1 MB/s eta 0:00:00
Collecting requests (from transformers)
Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting tokenizers<0.20,>=0.19 (from transformers)
Downloading tokenizers-0.19.1-cp312-none-win_amd64.whl.metadata (6.9 kB)
Collecting safetensors>=0.4.1 (from transformers)
Downloading safetensors-0.4.3-cp312-none-win_amd64.whl.metadata (3.9 kB)
Collecting tqdm>=4.27 (from transformers)
Downloading tqdm-4.66.2-py3-none-any.whl.metadata (57 kB)
---------------------------------------- 57.6/57.6 kB 3.2 MB/s eta 0:00:00
Requirement already satisfied: fsspec>=2023.5.0 in c:\opt\python312\lib\site-packages (from huggingface-hub<1.0,>=0.19.3->transformers) (2024.3.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in c:\opt\python312\lib\site-packages (from huggingface-hub<1.0,>=0.19.3->transformers) (4.11.0)
Collecting colorama (from tqdm>=4.27->transformers)
Downloading colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Collecting charset-normalizer<4,>=2 (from requests->transformers)
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl.metadata (34 kB)
Collecting idna<4,>=2.5 (from requests->transformers)
Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->transformers)
Downloading urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->transformers)
Downloading certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)
Downloading transformers-4.40.0-py3-none-any.whl (9.0 MB)
---------------------------------------- 9.0/9.0 MB 16.9 MB/s eta 0:00:00
Downloading huggingface_hub-0.22.2-py3-none-any.whl (388 kB)
--------------------------------------- 388.9/388.9 kB 25.2 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl (15.5 MB)
---------------------------------------- 15.5/15.5 MB 50.4 MB/s eta 0:00:00
Downloading packaging-24.0-py3-none-any.whl (53 kB)
---------------------------------------- 53.5/53.5 kB 2.7 MB/s eta 0:00:00
Downloading PyYAML-6.0.1-cp312-cp312-win_amd64.whl (138 kB)
---------------------------------------- 138.7/138.7 kB ? eta 0:00:00
Downloading regex-2024.4.16-cp312-cp312-win_amd64.whl (268 kB)
--------------------------------------- 268.4/268.4 kB 17.2 MB/s eta 0:00:00
Downloading safetensors-0.4.3-cp312-none-win_amd64.whl (289 kB)
--------------------------------------- 289.4/289.4 kB 18.6 MB/s eta 0:00:00
Downloading tokenizers-0.19.1-cp312-none-win_amd64.whl (2.2 MB)
---------------------------------------- 2.2/2.2 MB 71.1 MB/s eta 0:00:00
Downloading tqdm-4.66.2-py3-none-any.whl (78 kB)
---------------------------------------- 78.3/78.3 kB ? eta 0:00:00
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
---------------------------------------- 62.6/62.6 kB ? eta 0:00:00
Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
---------------------------------------- 163.8/163.8 kB ? eta 0:00:00
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl (100 kB)
---------------------------------------- 100.4/100.4 kB ? eta 0:00:00
Downloading idna-3.7-py3-none-any.whl (66 kB)
---------------------------------------- 66.8/66.8 kB ? eta 0:00:00
Downloading urllib3-2.2.1-py3-none-any.whl (121 kB)
---------------------------------------- 121.1/121.1 kB 7.4 MB/s eta 0:00:00
Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Installing collected packages: urllib3, safetensors, regex, pyyaml, packaging, numpy, idna, colorama, charset-normalizer, certifi, tqdm, requests, huggingface-hub, tokenizers, transformers
Successfully installed certifi-2024.2.2 charset-normalizer-3.3.2 colorama-0.4.6 huggingface-hub-0.22.2 idna-3.7 numpy-1.26.4 packaging-24.0 pyyaml-6.0.1 regex-2024.4.16 requests-2.31.0 safetensors-0.4.3 tokenizers-0.19.1 tqdm-4.66.2 transformers-4.40.0 urllib3-2.2.1
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
C:\opt\Python312\Lib\site-packages\huggingface_hub\file_download.py:148: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\michael\.cache\huggingface\hub\models--google--gemma-7b. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Traceback (most recent call last):
File "C:\wse_github\obrienlabsdev\machine-learning\environments\windows\src\google-gemma\gemma-gpu.py", line 16, in <module>
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", token=access_token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\opt\Python312\Lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\opt\Python312\Lib\site-packages\transformers\modeling_utils.py", line 3086, in from_pretrained
raise ImportError(
ImportError: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install accelerate`
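Both failures so far (`transformers` missing, then `accelerate` missing) only surface after the script has started. A minimal preflight sketch that checks for the packages up front, using only the standard library; the `REQUIRED` list reflects what this script ended up needing in this thread, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Packages gemma-gpu.py needed in this thread when device_map="auto" is used.
REQUIRED = ["torch", "transformers", "accelerate"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

`find_spec` only checks that a module can be located, not that it imports cleanly or that the version is compatible, so this catches the two errors above but not a broken install.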
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ sudo python gemma-gpu.py
I have not installed Visual Studio yet
https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
forgot
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:28:36_Pacific_Standard_Time_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ pip install accelerate
Collecting accelerate
Downloading accelerate-0.29.3-py3-none-any.whl.metadata (18 kB)
Requirement already satisfied: numpy>=1.17 in c:\opt\python312\lib\site-packages (from accelerate) (1.26.4)
Requirement already satisfied: packaging>=20.0 in c:\opt\python312\lib\site-packages (from accelerate) (24.0)
Collecting psutil (from accelerate)
Downloading psutil-5.9.8-cp37-abi3-win_amd64.whl.metadata (22 kB)
Requirement already satisfied: pyyaml in c:\opt\python312\lib\site-packages (from accelerate) (6.0.1)
Requirement already satisfied: torch>=1.10.0 in c:\opt\python312\lib\site-packages (from accelerate) (2.2.2)
Requirement already satisfied: huggingface-hub in c:\opt\python312\lib\site-packages (from accelerate) (0.22.2)
Requirement already satisfied: safetensors>=0.3.1 in c:\opt\python312\lib\site-packages (from accelerate) (0.4.3)
Requirement already satisfied: filelock in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (3.13.4)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (4.11.0)
Requirement already satisfied: sympy in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (1.12)
Requirement already satisfied: networkx in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (3.3)
Requirement already satisfied: jinja2 in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (3.1.3)
Requirement already satisfied: fsspec in c:\opt\python312\lib\site-packages (from torch>=1.10.0->accelerate) (2024.3.1)
Requirement already satisfied: requests in c:\opt\python312\lib\site-packages (from huggingface-hub->accelerate) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in c:\opt\python312\lib\site-packages (from huggingface-hub->accelerate) (4.66.2)
Requirement already satisfied: colorama in c:\opt\python312\lib\site-packages (from tqdm>=4.42.1->huggingface-hub->accelerate) (0.4.6)
Requirement already satisfied: MarkupSafe>=2.0 in c:\opt\python312\lib\site-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\opt\python312\lib\site-packages (from requests->huggingface-hub->accelerate) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\opt\python312\lib\site-packages (from requests->huggingface-hub->accelerate) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\opt\python312\lib\site-packages (from requests->huggingface-hub->accelerate) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\opt\python312\lib\site-packages (from requests->huggingface-hub->accelerate) (2024.2.2)
Requirement already satisfied: mpmath>=0.19 in c:\opt\python312\lib\site-packages (from sympy->torch>=1.10.0->accelerate) (1.3.0)
Downloading accelerate-0.29.3-py3-none-any.whl (297 kB)
---------------------------------------- 297.6/297.6 kB 3.7 MB/s eta 0:00:00
Downloading psutil-5.9.8-cp37-abi3-win_amd64.whl (255 kB)
--------------------------------------- 255.1/255.1 kB 15.3 MB/s eta 0:00:00
Installing collected packages: psutil, accelerate
Successfully installed accelerate-0.29.3 psutil-5.9.8
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
Looking in indexes: https://download.pytorch.org/whl/cu124
Requirement already satisfied: torch in c:\opt\python312\lib\site-packages (2.2.2)
ERROR: Could not find a version that satisfies the requirement torchvision (from versions: none)
ERROR: No matching distribution found for torchvision
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Looking in indexes: https://download.pytorch.org/whl/cu121
Requirement already satisfied: torch in c:\opt\python312\lib\site-packages (2.2.2)
Collecting torchvision
Downloading https://download.pytorch.org/whl/cu121/torchvision-0.17.2%2Bcu121-cp312-cp312-win_amd64.whl (5.7 MB)
---------------------------------------- 5.7/5.7 MB 40.1 MB/s eta 0:00:00
Collecting torchaudio
Downloading https://download.pytorch.org/whl/cu121/torchaudio-2.2.2%2Bcu121-cp312-cp312-win_amd64.whl (4.0 MB)
---------------------------------------- 4.0/4.0 MB 85.9 MB/s eta 0:00:00
Requirement already satisfied: filelock in c:\opt\python312\lib\site-packages (from torch) (3.13.4)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\opt\python312\lib\site-packages (from torch) (4.11.0)
Requirement already satisfied: sympy in c:\opt\python312\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in c:\opt\python312\lib\site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in c:\opt\python312\lib\site-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in c:\opt\python312\lib\site-packages (from torch) (2024.3.1)
Requirement already satisfied: numpy in c:\opt\python312\lib\site-packages (from torchvision) (1.26.4)
Collecting torch
Downloading https://download.pytorch.org/whl/cu121/torch-2.2.2%2Bcu121-cp312-cp312-win_amd64.whl (2454.8 MB)
---------------------------------------- 2.5/2.5 GB 6.2 MB/s eta 0:00:00
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading https://download.pytorch.org/whl/pillow-10.2.0-cp312-cp312-win_amd64.whl (2.6 MB)
---------------------------------------- 2.6/2.6 MB 84.2 MB/s eta 0:00:00
Requirement already satisfied: MarkupSafe>=2.0 in c:\opt\python312\lib\site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in c:\opt\python312\lib\site-packages (from sympy->torch) (1.3.0)
Installing collected packages: pillow, torch, torchvision, torchaudio
Attempting uninstall: torch
Found existing installation: torch 2.2.2
Uninstalling torch-2.2.2:
Successfully uninstalled torch-2.2.2
Successfully installed pillow-10.2.0 torch-2.2.2+cu121 torchaudio-2.2.2+cu121 torchvision-0.17.2+cu121
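The cu124 index had no torchvision wheel for this Python version, so the cu121 index was used instead. A minimal sketch for confirming which CUDA build ended up installed, by reading the local suffix of a version string such as `2.2.2+cu121`. The helper name `cuda_build_tag` is hypothetical; on a machine with torch installed you could pass it `torch.__version__`.

```python
from typing import Optional


def cuda_build_tag(version: str) -> Optional[str]:
    """Return the local build tag after '+', e.g. 'cu121', or None if absent."""
    _, sep, local = version.partition("+")
    return local if sep else None


# Version strings copied from the pip output above.
for v in ("2.2.2", "2.2.2+cu121", "0.17.2+cu121"):
    print(v, "->", cuda_build_tag(v))
```

A version with no suffix, like the `torch 2.2.2` that pip uninstalled, carries no CUDA tag, which is consistent with it having come from a different index.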
2115
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Downloading shards: 100%|##########| 4/4 [02:50<00:00, 42.56s/it]
Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
Loading checkpoint shards: 100%|##########| 4/4 [00:05<00:00, 1.26s/it]
C:\opt\Python312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:575: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 21:18:23
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 21:18:36
michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu` instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
Loading checkpoint shards: 100%|##########| 4/4 [00:05<00:00, 1.28s/it]
C:\opt\Python312\Lib\site-packages\transformers\models\gemma\modeling_gemma.py:575: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
genarate start: 21:21:54
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 21:22:06
16 sec
michael@14900c MINGW64 ~
$ nvidia-smi
Fri Apr 19 21:23:02 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.86 Driver Version: 551.86 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX A6000 WDDM | 00000000:01:00.0 Off | Off |
| 38% 71C P2 263W / 300W | 34075MiB / 49140MiB | 98% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1676 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 N/A N/A 4420 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 0 N/A N/A 7844 C C:\opt\Python312\python.exe N/A |
| 0 N/A N/A 8680 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 9372 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 0 N/A N/A 9396 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 0 N/A N/A 10568 C+G ...crosoft\Edge\Application\msedge.exe N/A |
| 0 N/A N/A 11124 C+G ...siveControlPanel\SystemSettings.exe N/A |
| 0 N/A N/A 11820 C+G ...cal\Microsoft\OneDrive\OneDrive.exe N/A |
| 0 N/A N/A 12232 C+G ...ekyb3d8bbwe\PhoneExperienceHost.exe N/A |
| 0 N/A N/A 13936 C+G ...oogle\Chrome\Application\chrome.exe N/A |
| 0 N/A N/A 17968 C+G ...\Docker\frontend\Docker Desktop.exe N/A |
Gemma 7B on dual A4500
michael@13900d MINGW64 ~
$ nvidia-smi
Sat Apr 20 01:12:02 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.12 Driver Version: 546.12 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA RTX A4500 WDDM | 00000000:01:00.0 Off | Off |
| 34% 65C P2 97W / 200W | 18597MiB / 20470MiB | 71% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA RTX A4500 WDDM | 00000000:02:00.0 Off | Off |
| 30% 62C P2 87W / 200W | 15537MiB / 20470MiB | 99% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 12596 C C:\opt\Python310\python.exe N/A |
| 1 N/A N/A 12596 C C:\opt\Python310\python.exe N/A |
+---------------------------------------------------------------------------------------+
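To total the VRAM in use across both cards, the `Memory-Usage` column can be pulled out of the table with a regular expression. This is a sketch that works on pasted text; the two rows are copied from the output above and the function name is hypothetical.

```python
import re

# Two rows copied from the nvidia-smi table above (dual RTX A4500).
SMI_ROWS = """
| 34% 65C P2 97W / 200W | 18597MiB / 20470MiB | 71% Default |
| 30% 62C P2 87W / 200W | 15537MiB / 20470MiB | 99% Default |
"""


def memory_usage_mib(text: str) -> list[tuple[int, int]]:
    """Return (used, total) MiB pairs for every 'NNNMiB / NNNMiB' field."""
    found = re.findall(r"(\d+)MiB\s*/\s*(\d+)MiB", text)
    return [(int(used), int(total)) for used, total in found]


pairs = memory_usage_mib(SMI_ROWS)
used = sum(u for u, _ in pairs)
total = sum(t for _, t in pairs)
print(f"{used} MiB used of {total} MiB ({used / total:.0%})")
```

For scripted monitoring, `nvidia-smi --query-gpu=memory.used,memory.total --format=csv` gives the same numbers without table parsing.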
michael@13900d MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
tokenizer_config.json: 100%|███████████████████████████████████████████| 33.6k/33.6k [00:00<00:00, 9.59MB/s]
C:\opt\Python310\lib\site-packages\huggingface_hub\file_download.py:149: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\michael\.cache\huggingface\hub\models--google--gemma-7b. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
tokenizer.model: 100%|█████████████████████████████████████████████████| 4.24M/4.24M [00:00<00:00, 30.7MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████| 17.5M/17.5M [00:00<00:00, 111MB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████| 636/636 [00:00<00:00, 1.28MB/s]
config.json: 100%|█████████████████████████████████████████████████████████████████| 629/629 [00:00<?, ?B/s]
model.safetensors.index.json: 100%|████████████████████████████████████| 20.9k/20.9k [00:00<00:00, 20.9MB/s]
model-00001-of-00004.safetensors: 100%|█████████████████████████████████| 5.00G/5.00G [00:46<00:00, 108MB/s]
model-00002-of-00004.safetensors: 100%|█████████████████████████████████| 4.98G/4.98G [00:45<00:00, 110MB/s]
model-00003-of-00004.safetensors: 100%|█████████████████████████████████| 4.98G/4.98G [00:45<00:00, 109MB/s]
model-00004-of-00004.safetensors: 100%|█████████████████████████████████| 2.11G/2.11G [00:19<00:00, 109MB/s]
Downloading shards: 100%|█████████████████████████████████████████████████████| 4/4 [02:36<00:00, 39.23s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████| 4/4 [00:07<00:00, 1.95s/it]
generation_config.json: 100%|███████████████████████████████████████████████| 137/137 [00:00<00:00, 273kB/s]
genarate start: 01:03:20
C:\opt\Python310\lib\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 01:05:09
import os
# Expose both GPUs to PyTorch; switch to "0" to restrict the run to one GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
#os.environ["CUDA_VISIBLE_DEVICES"] = "0"
from transformers import AutoTokenizer, AutoModelForCausalLM
from datetime import datetime
# Hugging Face access token (redacted)
#access_token='hf_cfTP...XCQqH'
access_token='hf_cfTP....QqH'
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b", token=access_token)
# device_map="auto" spreads the model layers across the visible GPUs
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto", token=access_token)
input_text = "how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process."
# Keep the timestamps as datetime objects so they can be reused after the run
time_start = datetime.now()
print("genarate start: ", time_start.strftime("%H:%M:%S"))
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=10000)
print(tokenizer.decode(outputs[0]))
time_end = datetime.now()
print("end", time_end.strftime("%H:%M:%S"))
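The script only prints `HH:MM:SS` stamps, so the run time has to be worked out by hand. A small sketch for doing that from two printed stamps; `elapsed_seconds` is a hypothetical helper, and it assumes a run shorter than 24 hours so that a stamp pair crossing midnight can be corrected.

```python
from datetime import datetime, timedelta


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS stamps, assuming the run lasted under 24 hours."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        # The end stamp is on the next day.
        delta += timedelta(days=1)
    return int(delta.total_seconds())


# Stamps copied from the "genarate start" / "end" lines of a run above.
print(elapsed_seconds("01:03:20", "01:05:09"), "seconds")
```

Inside the script itself, subtracting two `datetime.now()` values gives the same result without any parsing.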
michael@13900b MINGW64 ~
$ nvidia-smi
Sat Apr 20 09:08:17 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.12 Driver Version: 552.12 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 WDDM | 00000000:01:00.0 Off | Off |
| 0% 39C P2 101W / 480W | 18832MiB / 24564MiB | 87% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 4090 WDDM | 00000000:02:00.0 On | Off |
| 0% 48C P2 99W / 480W | 16678MiB / 24564MiB | 97% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 25976 C C:\opt\miniconda3\python.exe N/A |
| 1 N/A N/A 1288 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 1 N/A N/A 3568 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 1 N/A N/A 7700 C+G ...8bbwe\SnippingTool\SnippingTool.exe N/A |
| 1 N/A N/A 8504 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 1 N/A N/A 12632 C+G C:\Windows\explorer.exe N/A |
| 1 N/A N/A 12700 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 1 N/A N/A 16244 C+G ...US\ArmouryDevice\asus_framework.exe N/A |
| 1 N/A N/A 17392 C+G ...ekyb3d8bbwe\PhoneExperienceHost.exe N/A |
| 1 N/A N/A 18056 C+G ...e5b\Corsair iCUE5 Software\iCUE.exe N/A |
| 1 N/A N/A 18392 C+G ...GeForce Experience\NVIDIA Share.exe N/A |
| 1 N/A N/A 18632 C+G ...crosoft\Edge\Application\msedge.exe N/A |
| 1 N/A N/A 18844 C+G ...cal\Microsoft\OneDrive\OneDrive.exe N/A |
| 1 N/A N/A 21116 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 1 N/A N/A 22288 C+G ....5435.0_x64__8j3eq9eme6ctt\IGCC.exe N/A |
| 1 N/A N/A 23536 C+G ...sair iCUE5 Software\QmlRenderer.exe N/A |
| 1 N/A N/A 24584 C+G ...siveControlPanel\SystemSettings.exe N/A |
| 1 N/A N/A 24816 C+G ...\Docker\frontend\Docker Desktop.exe N/A |
| 1 N/A N/A 25976 C C:\opt\miniconda3\python.exe N/A |
| 1 N/A N/A 27624 C+G C:\opt\vscode\Code.exe N/A |
michael@13900b MINGW64 /c/wse_github/obrienlabsdev/machine-learning/environments/windows/src/google-gemma (main)
$ python gemma-gpu.py
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████| 4/4 [00:06<00:00, 1.60s/it]
genarate start: 09:07:39
C:\Users\michael\AppData\Roaming\Python\Python311\site-packages\transformers\models\gemma\modeling_gemma.py:555: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
<bos>how is gold made in collapsing neutron stars - specifically what is the ratio created during the beta and r process.
Answer:
Step 1/2
First, when a neutron star collapses, it undergoes a process called gravitational collapse, which causes the star to rapidly lose mass and density. This process releases a tremendous amount of energy, which can cause the star to explode in a supernova. During the supernova, the star's core undergoes a process called the r-process, which is responsible for creating heavy elements like gold. The r-process occurs when neutrons are added to atomic nuclei, causing them to become unstable and undergo beta decay. This process continues until the nucleus reaches a stable state, which is usually a heavy element like gold.
Step 2/2
The ratio of gold created during the r-process is not well understood, as it depends on a variety of factors, including the mass and density of the star, the amount of energy released during the supernova, and the specific conditions of the r-process. However, it is believed that the r-process is responsible for creating most of the heavy elements in the universe, including gold.<eos>
end 09:09:22
(base)
Single NVIDIA A6000 - Ampere GA102 (see L40s equivalent on GCP): 12 seconds for 170 tokens = 14 tokens/sec; 98% GPU utilization of 10k cores; 34GB/48GB VRAM; 85% TDP (250W of 300W); 0% PCIe bus interface load; 768 GB/s memory bandwidth (384 bit)
CPU - 14900K - 6400MHz RAM - overclocked: 89 seconds (7.4x A6000) = 1.9 tokens/sec; 90% CPU utilization of 32 vCores (24+8); 33GB/64GB RAM
Dual NVIDIA 4090 - Ada AD102: 102 seconds (8.5x A6000) = 1.7 tokens/sec; 70% GPU utilization of 2x 16k cores; 34GB/48GB VRAM; 22% TDP (220W of 900W); 60% PCIe bus interface load; 1008 GB/s memory bandwidth (384 bit)
Dual NVIDIA A4500 - Ampere GA102: 119 seconds (10x A6000) = 1.4 tokens/sec; 75% GPU utilization of 2x 7k cores; 34GB/40GB VRAM; 40% TDP (160W of 400W); 75% PCIe bus interface load; 640 GB/s memory bandwidth (320 bit)
CPU - 13900KS: 147 seconds (12.3x A6000) = 1.2 tokens/sec; 96% CPU utilization of 32 vCores (24+8); 34GB/64GB RAM
CPU - 13900K: 152 seconds (13x A6000) = 1.1 tokens/sec; 98% CPU utilization of 32 vCores (24+8); 35GB/192GB RAM
Dual L4 on GCP - Ada AD104: 202 seconds (17x A6000) = 0.85 tokens/sec; 65% GPU utilization of 2x 7k cores; 35GB/46GB VRAM; 50% TDP (70W of 150W); ?% PCIe bus interface load; 300 GB/s memory bandwidth (192 bit)
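The tokens/sec and "x A6000" figures follow directly from the wall-clock seconds and the 170-token output. A short sketch of that arithmetic, using the seconds listed above; the dictionary keys and the `summarize` helper are labels chosen for this example only.

```python
TOKENS = 170        # generated tokens per run, as stated above
BASELINE = "A6000"  # configuration the slowdown factors are relative to

# Wall-clock seconds per configuration, copied from the list above.
seconds = {
    "A6000": 12,
    "14900K CPU": 89,
    "Dual 4090": 102,
    "Dual A4500": 119,
    "13900KS CPU": 147,
    "13900K CPU": 152,
    "Dual L4 (GCP)": 202,
}


def summarize(runs, tokens, baseline):
    """Map each run to (tokens per second, slowdown relative to the baseline)."""
    base = runs[baseline]
    return {name: (tokens / s, s / base) for name, s in runs.items()}


results = summarize(seconds, TOKENS, BASELINE)
for name, (tps, slowdown) in results.items():
    print(f"{name:14s} {tps:5.2f} tokens/sec  {slowdown:4.1f}x {BASELINE}")
```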
see #27
https://ai.google.dev/gemma/docs?hl=en
https://www.kaggle.com/models/google/gemma
Gemma on Vertex AI Model Garden: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335?_ga=2.34476193.-1036776313.1707424880&hl=en
https://obrienlabs.medium.com/google-gemma-7b-and-2b-llm-models-are-now-available-to-developers-as-oss-on-hugging-face-737f65688f0d
https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf
https://blog.google/technology/developers/gemma-open-models/
https://huggingface.co/google/gemma-7b
https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/l4/PB-11316-001_v01.pdf
Pull and rebuild the latest llama.cpp (see the previous article on running llama 70b: https://github.com/ObrienlabsDev/machine-learning/issues/7)
https://github.com/abetlen/llama-cpp-python/issues/1207
https://github.com/ggerganov/llama.cpp/commit/580111d42b3b6ad0a390bfb267d6e3077506eb31
7B (32G model needs 64G on a CPU or an RTX-A6000/RTX-5000 Ada) and 2B (on a MacBook M1 Max: 32G unified RAM) - working perfectly
https://cloud.google.com/blog/products/ai-machine-learning/performance-deepdive-of-gemma-on-google-cloud
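A rough check on why the 7B model occupies about 34GB once loaded, even though the four downloaded shards total about 17GB. This sketch assumes the checkpoint stores 2 bytes per weight and that the script above loads it at 4 bytes per weight because it passes no `torch_dtype`; both are assumptions, not something the logs state. The function name is hypothetical.

```python
# Shard sizes in GB as printed by the download progress bars above.
SHARDS_GB = [5.00, 4.98, 4.98, 2.11]


def estimate_footprint_gb(shards_gb, stored_bytes=2, loaded_bytes=4):
    """Scale the on-disk size by the ratio of loaded to stored bytes per weight."""
    return sum(shards_gb) * loaded_bytes / stored_bytes


on_disk = sum(SHARDS_GB)
as_float32 = estimate_footprint_gb(SHARDS_GB)
as_16bit = estimate_footprint_gb(SHARDS_GB, loaded_bytes=2)
print(f"on disk: {on_disk:.2f} GB")
print(f"float32: {as_float32:.2f} GB")
print(f"16-bit : {as_16bit:.2f} GB")
```

If those assumptions hold, passing `torch_dtype=torch.bfloat16` to `from_pretrained` should roughly halve the footprint; that was not tested in the runs above.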