pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

`meta-llama/Meta-Llama-3-8B-Instruct` generates gibberish for long prompts #179

Closed griff4692 closed 2 weeks ago

griff4692 commented 3 weeks ago

scripts/prepare.sh works fine, and Llama-3 8B Instruct works on short prompts (< 1k tokens), but when given a long prompt it generates garbage.

... {Long input on the Guggenheim }
Question: Which is the largest number?
A) Frank Lloyd Wright's age in 1943.
B) The size of the collection at the Guggenheim.
C) The building number of the museum's first venue.
D) The number of sketches it took Frank Lloyd Wright to create the museum. Som at the G
15

 Care Care Care Care Care and the m care at the m ... and the m ... ... mason of the care the 88

 G (st
14 " G (re m... (G (the G (the m- the G, the 7 m- the 1

 G and the G and the m and the and the first 15 of the original
...

As a sanity check, I can confirm that Llama-2 7B chat works as expected.

I tried with and without the official chat template for Llama-3 and neither works.

Has anyone had a similar issue?

yanboliang commented 2 weeks ago

@griff4692 Can you let me know the prompt length you are using? I just tried an example of length 1518, and the output looks very reasonable.

VikParuchuri commented 2 weeks ago

The prompt below should reproduce the error.

My repro steps were:

  1. Clone gpt-fast
  2. Install torch-nightly (CUDA 12.1) plus flash_attn
  3. Run ./scripts/prepare.sh with meta-llama/Meta-Llama-3-8B-Instruct
  4. python generate.py --checkpoint_path checkpoints/meta-llama/Meta-Llama-3-8B-Instruct/model.pth --prompt "$(cat prompt.txt)"

Prompt:

Carefully read the beginning of the Wikipedia page on the Guggenheim meseum. You will be asked to answer a question at the end.

# Introduction

The Solomon R. Guggenheim Museum, often referred to as The Guggenheim, is an art museum at 1071 Fifth Avenue between 88th and 89th Streets on the Upper East Side of Manhattan in New York City. It hosts a permanent collection of Impressionist, Post-Impressionist, early Modern, and contemporary art and also features special exhibitions throughout the year. It was established by the Solomon R. Guggenheim Foundation in 1939 as the Museum of Non-Objective Painting, under the guidance of its first director, Hilla von Rebay. The museum adopted its current name in 1952, three years after the death of its founder Solomon R. Guggenheim. It continues to be operated and owned by the Solomon R. Guggenheim Foundation.
The museum's building, a landmark work of 20th-century architecture designed by Frank Lloyd Wright, drew controversy for the unusual shape of its display spaces and took 15 years to design and build; it was completed in 1959. It consists of a six-story, bowl-shaped main gallery to the south, a four-story "monitor" to the north, and a ten-story annex to the northeast. A six-story helical ramp extends along the main gallery's perimeter, under a central ceiling skylight. The Thannhauser Collection is housed within the top three stories of the monitor, and there are additional galleries in the annex and a learning center in the basement. The museum building's design was controversial when it was completed but was widely praised afterward. The building underwent extensive renovations from 1990 to 1992, when the annex was built, and it was renovated again from 2005 to 2008.
The museum's collection has grown over the decades and is founded upon several important private collections, including those of Guggenheim, Karl Nierendorf, Katherine Sophie Dreier, Justin Thannhauser, Rebay, Giuseppe Panza, Robert Mapplethorpe, and the Bohen Foundation. The collection, which includes around 8,000 works as of 2022, is shared with sister museums in Bilbao, Spain, and Venice, Italy. In 2023, nearly 861,000 people visited the museum.

# History

## Early years and Hilla Rebay
Solomon R. Guggenheim, a member of a wealthy mining family, began collecting works of the old masters in the 1890s. In 1926, he met artist Hilla von Rebay, who introduced him to European avant-garde art, in particular abstract art that she felt had a spiritual and utopian aspect (non-objective art). Guggenheim completely changed his collecting strategy, turning to the work of Wassily Kandinsky, among others. He began to display his collection to the public at his apartment in the Plaza Hotel in New York City. Guggenheim and Rebay initially considered building a museum at Rockefeller Center in Manhattan. As the collection grew, Guggenheim established the Solomon R. Guggenheim Foundation, in 1937, to foster the appreciation of modern art.
The foundation's first venue, the Museum of Non-Objective Painting, opened in 1939, under Rebay's direction, at 24 East 54th Street in midtown Manhattan. Under her guidance, Guggenheim sought to include in the collection the most important examples of non-objective art by early modernists. He wanted to display the collection at the 1939 New York World's Fair in Queens, but Rebay advocated for a more permanent location in Manhattan. By the early 1940s, the foundation had accumulated such a large collection of avant-garde paintings that the need for a permanent museum was apparent, and Rebay wanted to establish it before Guggenheim died.

## Design process
In 1943, Rebay and Guggenheim wrote a letter to Frank Lloyd Wright asking him to design a structure to house and display the collection. Rebay thought the 76-year-old Wright was dead, but Guggenheim's wife Irene Rothschild Guggenheim knew better and suggested that Rebay contact him. Wright accepted the opportunity to experiment with his "organic" style in an urban setting, saying that he had never seen a museum that was "properly designed". He was hired to design the building in June 1943. He was to receive a 10 percent commission on the project, which was expected to cost at least $1 million. It took him 15 years, more than 700 sketches, and six sets of working drawings to create and complete the museum, after a series of difficulties and delays; the cost eventually doubled from the initial estimate.
Rebay envisioned a space that would facilitate a new way of seeing modern art. She wrote Wright that "each of these great masterpieces should be organized into space, and only you ... would test the possibilities to do so. ... I want a temple of spirit, a monument!" Critic Paul Goldberger later wrote that Wright's modernist building was a catalyst for change, making it "socially and culturally acceptable for an architect to design a highly expressive, intensely personal museum. In this sense almost every museum of our time is a child of the Guggenheim." The Guggenheim is the only museum Wright designed; its urban location required him to design it in a vertical rather than horizontal form, far different from his earlier, rural works. Since he was not licensed as an architect in New York, he relied on Arthur Cort Holden, of the architectural firm Holden, McLaughlin & Associates, to deal with New York City's Board of Standards and Appeals.
From 1943 to early 1944, Wright produced four differing designs. One had a hexagonal shape and level floors for the galleries, though all the others had circular schemes and used a ramp continuing around the building. In his notes, he indicated that he wanted a "well proportioned floor space from bottom to top—a wheel chair going around and up and down". His original concept was called an inverted "ziggurat", because it resembled the steep steps on the ziggurats built in ancient Mesopotamia. Several architecture professors have speculated that the helical ramp and glass dome of Giuseppe Momo's 1932 staircase at the Vatican Museums was an inspiration for Wright's ramp and atrium.

Question: Which is the largest number?
A) Frank Lloyd Wright's age in 1943.
B) The size of the collection at the Guggenheim.
C) The building number of the museum's first venue.
D) The number of sketches it took Frank Lloyd Wright to create the museum.

Output:


Sup care care care Care Care Care
200 of the 15

when the m and the mese... The G
1 in the mid 22 ______ the m... V (F
 9 "G

the G and the 17 of the G there, the "G the M
 G 
G
 G
199 and the m and the first care of the beginning 

taking the m
 G
 ... the beginning the beginning it ... 

 G  G and the beginning the and the public the and the the �Care the museum  Care "G
 Care for the  the macc the "in the beginning it to the  the  the again the 10 years the Th. the beginning the most "G ... at the g and the Es and the 1- the "m and the "at the A "the "at the "m at the G the and the and

yanboliang commented 2 weeks ago

I can reproduce it now: a prompt of length 2313 fails. I'm looking into it.

VikParuchuri commented 2 weeks ago

I think the cause is that the rope_theta value is not being set properly. When I manually set it to 500000.0, it seems to work. I can test to verify and open a PR.
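
For anyone wondering why a wrong rope_theta would only bite on long prompts, here is a rough sketch of the intuition, not the actual gpt-fast code. It assumes the standard RoPE frequency formula, a head dimension of 128 (4096 hidden size / 32 heads, my assumption for this model), and that the value used before the manual override was the common default of 10000.0, which this thread does not confirm. The helper names are made up for illustration.

```python
import math


def rope_inv_freqs(head_dim: int, base: float) -> list[float]:
    # Standard RoPE: dimension pair i rotates by position * base ** (-2i / head_dim).
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def max_angle_mismatch(head_dim: int, position: int, base_a: float, base_b: float) -> float:
    # Largest absolute difference in rotation angle (radians) over all dimension
    # pairs when the same position is encoded with two different bases.
    freqs_a = rope_inv_freqs(head_dim, base_a)
    freqs_b = rope_inv_freqs(head_dim, base_b)
    return max(abs(position * (fa - fb)) for fa, fb in zip(freqs_a, freqs_b))


HEAD_DIM = 128                  # assumption: 4096 hidden size / 32 heads
ASSUMED_DEFAULT_BASE = 10000.0  # assumption: value in effect before the override
LLAMA3_BASE = 500000.0          # value reported to work in this thread

for position in (100, 1000, 2313):
    mismatch = max_angle_mismatch(HEAD_DIM, position, ASSUMED_DEFAULT_BASE, LLAMA3_BASE)
    print(f"position {position:>5}: max angle mismatch = {mismatch:.1f} rad")
```

The mismatch grows linearly with position, which would be consistent with short prompts looking fine and long ones degrading, though that is only one plausible reading of the symptoms.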

yanboliang commented 2 weeks ago

Closing, as the fix has landed.