lcompilers / lpython

Python compiler
https://lpython.org/

Compile and run ``picoGPT/gpt2.py`` #1511

Open czgdp1807 opened 1 year ago

czgdp1807 commented 1 year ago

Reference - https://github.com/jaymody/picoGPT/blob/main/gpt2.py

Will work on this after https://github.com/lfortran/lfortran/pull/1213 and some improvements in arr_slice.cpp.

czgdp1807 commented 1 year ago

We need implementations of reduction functions such as np.mean and np.var to compile and execute the above code.
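For context, gpt2.py uses those reductions inside its layer normalization; a minimal sketch of that pattern (paraphrased from the reference above, so names and details may differ slightly from the actual file) is:

import numpy as np

def layer_norm(x, g, b, eps: float = 1e-5):
    # reductions over the last axis; these are the np.mean/np.var calls LPython needs to support
    mean = np.mean(x, axis=-1, keepdims=True)
    variance = np.var(x, axis=-1, keepdims=True)
    x = (x - mean) / np.sqrt(variance + eps)  # normalize to zero mean, unit variance
    return g * x + b                          # scale and shift with learned parameters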

certik commented 1 year ago

Notice how they annotate the arrays with dimensions. We should improve how arrays are annotated so that this information becomes part of the argument type.
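For illustration, the dimension annotations in gpt2.py live in comments next to the function signatures, roughly like this (a paraphrased sketch, not the exact source; the helper here is illustrative):

import numpy as np

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def attention(q, k, v, mask):  # [n_q, d_k], [n_k, d_k], [n_k, d_v], [n_q, n_k] -> [n_q, d_v]
    # today the shapes are recorded only in the comment above;
    # the goal is to carry this information in the argument types themselves
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v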

certik commented 1 year ago

Here is how to try it out:

git clone https://github.com/jaymody/picoGPT.git
cd picoGPT
mamba create -n pico python=3.9
conda activate pico

On Apple M1 apply this diff:

--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,6 @@
 numpy==1.24.1 # used for the actual model code/weights
 regex==2017.4.5 # used by the bpe tokenizer
 requests==2.27.1 # used to download gpt-2 files from openai
-tensorflow==2.11.0 # used to load the gpt-2 weights from the open-ai tf checkpoint
+tensorflow-macos==2.11.0 # used to load the gpt-2 weights from the open-ai tf checkpoint
 tqdm==4.64.0 # progress bar to keep your sanity
 fire==0.5.0 # easy CLI creation

Then:

pip install -r requirements.txt
python gpt2.py "Alan Turing theorized that computers would one day become"

It should look like this:

$ python gpt2.py "Alan Turing theorized that computers would one day become"
generating: 100%|███████████████████████████████| 40/40 [00:13<00:00,  2.94it/s]
 the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.

and it takes about 13 seconds.

The large model (about 6GB download) can be used as:

$ python gpt2.py "Who wrote the book the origin of species?" --model_size "1558M" 
generating: 100%|███████████████████████████████| 40/40 [02:28<00:00,  3.72s/it]

The book is called "The Origin of Species" by Charles Darwin.

What is the origin of species?

The origin of species is the process by which new species are created

and the 774M model (about 3GB download):

$ python gpt2.py "What is the capital of the Czech Republic?" --model_size "774M"
generating: 100%|███████████████████████████████| 40/40 [01:37<00:00,  2.44s/it]

The capital of the Czech Republic is Prague.

What is the capital of the Czech Republic?

The capital of the Czech Republic is Prague.

What is the capital of

certik commented 1 year ago

To compare against the standard https://huggingface.co/gpt2-large:

mamba install transformers

then:

In [1]: from transformers import pipeline, set_seed

In [2]: generator = pipeline('text-generation', model='gpt2-large')

In [3]: set_seed(42)

In [4]: %time generator("The capital of the Czech Republic is", max_length=40, num_return_sequences=1)
/Users/ondrej/mambaforge/envs/ml/lib/python3.11/site-packages/transformers/generation/utils.py:1186: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
CPU times: user 53.2 s, sys: 790 ms, total: 54 s
Wall time: 7.21 s
Out[4]: [{'generated_text': 'The capital of the Czech Republic is Prague and the capital is Brno.\n\nThe region is separated from central and eastern Bohemia (the former name of the Czech Republic) by an area of'}]

In [5]: %time generator("The capital of the Czech Republic is", max_length=40, num_return_sequences=1)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
CPU times: user 51.6 s, sys: 792 ms, total: 52.4 s
Wall time: 7.02 s
Out[5]: [{'generated_text': 'The capital of the Czech Republic is Prague and the main airport is Brno. Prague is a vibrant, lively city with a very diverse and colourful skyline. It is a cosmopolitan city with a wonderful'}]

It takes about 7s or so. (The answers contain incorrect facts, but I assume that's normal for GPT-2.)