microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime
MIT License

GPU driver error when using AMD eGPU via DirectML #644

Open x0wllaar opened 2 months ago

x0wllaar commented 2 months ago

Hello!

I've encountered a weird error when trying to use the genai runtime on my setup. I know that my setup is unconventional to say the least and that it may not be supported, but I'd still appreciate any help with debugging the issue.

If it's a fundamental limitation, I think it would help to have a clearer error message and a note about it in the docs.

Thank you!

My setup:

CPU: i7-1360P
RAM: 64GB
GPU: Radeon RX 6700 XT, connected via Thunderbolt 3 (using Sonnet Breakaway box)
VRAM: 12GB dedicated (on GPU) + 32GB shared (as reported by Task Manager)
GPU driver version: 32.0.11002.2000

The problem:

For prompts that are long enough, the compute_logits() function will throw the following error:

RuntimeError: D:\a\_work\1\onnxruntime-genai\src\dml\dml_command_recorder.cpp(143)\onnxruntime_genai.cp312-win_amd64.pyd!00007FF8CA1FE073: (caller: 00007FF8CA1F1F75) Exception(1) tid(24bc) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

At the same time, the AMD software will report that a GPU driver timeout happened.
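
For reference, the `887A0006` in the message is an HRESULT. In the Windows SDK DXGI headers that value is `DXGI_ERROR_DEVICE_HUNG`, which would be consistent with the driver timeout reported by the AMD software. Below is a minimal sketch for pulling the code out of an error string and naming it. The helper `name_dxgi_error` and the small lookup table are made up for illustration and cover only a few common DXGI device errors:

```python
import re

# A few DXGI device-error HRESULTs (values as defined in the Windows SDK).
# This table is deliberately incomplete; it is only an illustration.
DXGI_ERRORS = {
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
    0x887A0006: "DXGI_ERROR_DEVICE_HUNG",
    0x887A0007: "DXGI_ERROR_DEVICE_RESET",
    0x887A0020: "DXGI_ERROR_DRIVER_INTERNAL_ERROR",
}


def name_dxgi_error(message):
    """Hypothetical helper: name the first 887Axxxx HRESULT found in message.

    Returns None when the message contains no such code.
    """
    match = re.search(r"\b(887A[0-9A-Fa-f]{4})\b", message)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return DXGI_ERRORS.get(code, "unknown DXGI error")


msg = "Exception(1) tid(24bc) 887A0006 The GPU will not respond to more commands"
print(name_dxgi_error(msg))
```

Running this on the message above prints the symbolic name of the error, which is easier to search for than the raw hex value.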

This coincides with high VRAM use, although the VRAM is not exhausted in these situations: only around 10GB of the 12GB is in use.

I suspect that what happens is that at this point, something starts pushing data into system RAM, but then the Thunderbolt-induced latency means that something times out as the data starts being shuffled around between the system RAM and the VRAM.

I have reproduced this issue both with an AWQ quantized LLaMA 3 that I converted to ONNX, and with the official microsoft/Phi-3-mini-128k-instruct-onnx model.

Reproduction:

  1. Have a setup like mine, I guess
  2. Download the DirectML version of microsoft/Phi-3-mini-128k-instruct-onnx:
     huggingface-cli download microsoft/Phi-3-mini-128k-instruct-onnx --include directml/directml-int4-awq-block-128/* --local-dir .\test-models-phi
  3. Try to do inference on a prompt that is 'long enough' (typically, 1536 tokens will suffice)

Expected behavior: the model does inference, with a performance drop when there's not enough VRAM.

Observed behavior: see above

Reproduction code

This code (sorry for including the long text there):

#Download with huggingface-cli download microsoft/Phi-3-mini-128k-instruct-onnx --include directml/directml-int4-awq-block-128/* --local-dir .\test-models-phi
import onnxruntime_genai as og

model = og.Model('test-models-phi/directml/directml-int4-awq-block-128')
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

# Set the max length to something sensible by default,
# since otherwise it will be set to the entire context length
search_options = {}
search_options['max_length'] = 8192

chat_template = '<|user|>\nSummarize this text for me: {input} <|end|>\n<|assistant|>'

#History of MS from Wikipedia
text = """
In late 1974, Paul Allen, a programmer at Honeywell, was walking through Harvard Square when he saw the cover of the January 1975 issue of Popular Electronics that demonstrated the Altair 8800, the first microcomputer.[7][8] Allen bought the magazine and rushed to Currier House at Harvard College, where he showed it to high school friend Bill Gates.[8] They saw potential to develop an implementation of BASIC for the system.[9]

Gates called Altair manufacturer Micro Instrumentation and Telemetry Systems (MITS), offering to demonstrate the implementation. Allen and Gates had neither an interpreter nor an Altair system, yet in the eight weeks before the demo, they developed an interpreter with the help of Monte Davidoff. When Allen flew to Albuquerque to meet with MITS, the interpreter worked and MITS agreed to distribute Altair BASIC.[10][8] Allen moved to Albuquerque, Gates soon quit Harvard to join him, and they co-founded Microsoft there.[8] Revenues of the company totalled $16,005 by the end of 1976.

Microsoft staff in Albuquerque, December 7, 1978

Top: Steve Wood, Bob Wallace, Jim Lane
Middle: Bob O'Rear, Bob Greenberg, Marc McDonald, Gordon Letwin
Bottom: Bill Gates, Andrea Lewis, Marla Wood, Paul Allen
Not pictured: Ric Weiland, Miriam Lubow[11]
Allen came up with the original name of Micro-Soft, a portmanteau of microcomputer and software.[12] Hyphenated in its early incarnations, on November 26, 1976, the company was registered under that name with the Secretary of State of New Mexico. The first employee Gates and Allen hired was their high school collaborator Ric Weiland.[10] The company's first international office was founded on November 1, 1978, in Japan, entitled "ASCII Microsoft" (now called "Microsoft Japan"), and on November 29, 1979, the term, "Microsoft" was first used by Bill Gates.[7] On January 1, 1979, the company moved from Albuquerque to a new home in Bellevue, Washington,[7] since it was hard to recruit top programmers to Albuquerque. Shortly before the move, 11 of the then-13 employees posed for the staff photo on the right.[13]

Steve Ballmer joined the company on June 11, 1980, and would later succeed Bill Gates as CEO[7] from January 2000 until February 2014. The company restructured on June 25, 1981, to become an incorporated business in its home state of Washington (with a further change of its name to "Microsoft Corporation, Inc."). As part of the restructuring, Bill Gates became president of the company and chairman of the board, and Paul Allen became executive vice president and vice chairman.[7] In 1983, Allen left the company after receiving a Hodgkin lymphoma diagnosis, though he remained on the board as vice-chairman. This effectively ended the formal business partnership between Gates and Allen, which had been strained months prior due to a contentious dispute over Microsoft equity.[14] Later in the decade, Gates and Allen repaired their relationship and together the two donated millions to their childhood school Lakeside.[8] They remained friends until Allen's death in October 2018.[15]

Microsoft's early products were different variants of Microsoft BASIC which was the dominant programming language in late 1970s and early 1980s home computers such as Apple II (Applesoft BASIC) and Commodore 64 (Commodore BASIC), and were also provided with early versions of the IBM PC as the IBM Cassette BASIC.

Microsoft also marketed through an Apple dealer in West Palm Beach, Florida two products for the Radio-Shack TRS-80. One was "Typing Tutor" which led the user through learning to use a keyboard. The other was authored by a professor at the University of Hawaii called "MuMATH" and had the ability to do mathematics in long integer math to avoid floating point numbers.

The Z-80 SoftCard, released in 1980
The first hardware product[16] was the Z-80 SoftCard which enabled the Apple II to run the CP/M operating system, at the time an industry-standard operating system for running business software and many compilers and interpreters for several high-level languages on microcomputers. The SoftCard was first demonstrated publicly at the West Coast Computer Faire in March 1980.[17][18] It was an immediate success; 5,000 cards, a large number given the microcomputer market at the time, were purchased in the initial three months at $349 (~$1,084 in 2023) each and it was Microsoft's number one revenue source in 1980.[19]

The first operating system publicly released by the company was a variant of Unix announced on August 25, 1980. Acquired from AT&T through a distribution license, Microsoft dubbed it Xenix, and hired Santa Cruz Operation in order to port/adapt the operating system to several platforms.[20][21] This Unix variant would become home to the first version of Microsoft's word processor, Microsoft Word. Originally titled "Multi-Tool Word", Microsoft Word became notable for its use of "What You See Is What You Get", or WYSIWYG pioneered by the Xerox Alto and the Bravo text editor in the 1970s.[22][23]

Word was first released in the spring of 1983, and free demonstration copies of the application were bundled with the November 1983 issue of PC World, making it one of the first programs to be distributed on-disk with a magazine. (Earlier magazine on-disk distributions included Robert Uiterwyk's BASIC in the May 1977 issue of Information Age.)[24][25] However, Xenix was never sold to end users directly although it was licensed to many software OEMs for resale. It grew to become the most popular version of Unix, measured by the number of machines running it[26] (note that Unix is a multi-user operating system, allowing simultaneous access to a machine by several users). By the mid-1980s Microsoft had gotten out of the Unix business, except for its ownership stake in SCO.[20]

A typical PC DOS command line
IBM first approached Gates and Allen about Microsoft's upcoming IBM Personal Computer (IBM PC) in July 1980, shortly after Gates's mother began working on United Way's executive board with IBM CEO John Opel.[25][27] On August 12, 1981, after negotiations with Digital Research failed, IBM awarded a contract to Microsoft to provide a version of the CP/M operating system, which was set to be used in the IBM PC. For this deal, Microsoft purchased a CP/M clone called 86-DOS from Tim Paterson of Seattle Computer Products for less than US$100,000, which IBM renamed to IBM PC DOS. The original CP/M was made by Gary Kildall of Digital Research, Inc. Due to potential copyright infringement problems with CP/M, IBM marketed both CP/M and PC DOS for US$240 and US$40, respectively, with PC DOS eventually becoming the standard because of its lower price.[28][29] Thirty-five of the company's 100 employees worked on the IBM project for more than a year. When the IBM PC debuted, Microsoft was the only company that offered operating system, programming language, and application software for the new computer.[27]

InfoWorld stated in 1984 that Microsoft, with $55 million (~$137 million in 2023) in 1983 sales,[30]

is widely recognized as the most influential company in the microcomputer-software industry. Claiming more than a million installed MS-DOS machines, founder and chairman Bill Gates has decided to certify Microsoft's jump on the rest of the industry by dominating applications, operating systems, peripherals and, most recently, book publishing. Some insiders say Microsoft is attempting to be the IBM of the software industry.

A 1982 ad for MS-DOS
In 1983, in collaboration with numerous companies, Microsoft created a home computer system, MSX, which contained its own version of the DOS operating system, called MSX-DOS; this became relatively popular in Japan, Europe and South America.[10][31][32] Later, the market saw a flood of IBM PC clones after Columbia Data Products successfully cloned the IBM BIOS, quickly followed by Eagle Computer and Compaq.[33][34][35][36] The deal with IBM allowed Microsoft to have control of its own QDOS derivative, MS-DOS, and through aggressive marketing of the operating system to manufacturers of IBM-PC clones, Microsoft rose from a small player to one of the major software vendors in the home computer industry.[37] With the release of the Microsoft Mouse on May 2, 1983, Microsoft continued to expand its product line in other markets. This expansion included Microsoft Press, a book publishing division, on July 11 the same year, which debuted with two titles: Exploring the IBM PCjr Home Computer by Peter Norton, and The Apple Macintosh Book by Cary Lu.[7]

Ireland became home to one of Microsoft's international production facilities in 1985, and on November 20 Microsoft released its first retail version of Microsoft Windows (Windows 1.0), originally a graphical extension for its MS-DOS operating system.[7] In August, Microsoft and IBM partnered in the development of a different operating system called OS/2. OS/2 was marketed in connection with a new hardware design proprietary to IBM, the PS/2.[39]

On February 16, 1986, Microsoft relocated their headquarters to a corporate office campus in Redmond, Washington. Around one month later, on March 13, the company went public with an IPO, raising US$61 million at US$21.00 per share. By the end of the trading day, the price had risen to US$28.00. In 1987, Microsoft eventually released their first version of OS/2 to OEMs.[40] By then the company was the world's largest producer of software for personal computers—ahead of former leader Lotus Development—and published the three most-popular Macintosh business applications.[41] That year the company purchased Forethought, the developer of PowerPoint and Microsoft's first major software acquisition on the 30th July 1987.[42]

A 1986 ad for Microsoft Windows 1.0
Meanwhile, Microsoft began introducing its most prominent office products. Microsoft Works, an integrated office program which combined features typically found in a word processor, spreadsheet, database and other office applications, saw its first release as an application for the Apple Macintosh towards the end of 1986.[10] Microsoft Works would later be sold with other Microsoft products including Microsoft Word and Microsoft Bookshelf, a reference collection introduced in 1987 that was the company's first CD-ROM product.[7][43] Later, on August 8, 1989, Microsoft introduced its most successful office product, Microsoft Office. Unlike the model of Microsoft Works, Microsoft Office was a bundle of separate office productivity applications, such as Microsoft Word, Microsoft Excel and so forth. While Microsoft Word and Microsoft Office were mostly developed internally, Microsoft also continued its trend of rebranding products from other companies, such as Microsoft SQL Server on January 13, 1988, a relational database management system for companies that was based on technology licensed from Sybase.[7]

On May 22, 1990, Microsoft launched Windows 3.0.[10] The new version of Microsoft's operating system boasted new features such as streamlined graphic user interface GUI and improved protected mode ability for the Intel 386 processor; it sold over 100,000 copies in two weeks.[10][44] Windows at the time generated more revenue for Microsoft than OS/2, and the company decided to move more resources from OS/2 to Windows.[45] In an internal memo to Microsoft employees on May 16, 1991, Bill Gates announced that the OS/2 partnership was over, and that Microsoft would henceforth focus its platform efforts on Windows and the Windows NT kernel. Some people, especially developers who had ignored Windows and committed most of their resources to OS/2, were taken by surprise, and accused Microsoft of deception. This changeover from OS/2 was frequently referred to in the industry as "the head-fake".[46][47] In the recent years, the popularity of OS/2 declined, and Windows quickly became the favored PC platform. 1991 also marked the founding of Microsoft Research, an organization in Microsoft for researching computer science subjects, and Microsoft Visual Basic, a popular development product for companies and individuals.[7]

The Microsoft sign at the entrance of the German Microsoft campus, Konrad-Zuse-Str. 1, Unterschleißheim
During the transition from MS-DOS to Windows, the success of Microsoft's product Microsoft Office allowed the company to gain ground on application-software competitors, such as WordPerfect and Lotus 1-2-3.[10][48] Novell, an owner of WordPerfect for a time, alleged that Microsoft used its inside knowledge of the DOS and Windows kernels and of undocumented Application Programming Interface features to make Office perform better than its competitors.[49] Eventually, Microsoft Office became the dominant business suite, with a market share far exceeding that of its competitors.[50] In March 1992, Microsoft released Windows 3.1 along with its first promotional campaign on TV; the software sold over three million copies in its first two months on the market.[7][10] In October, Windows for Workgroups 3.1 was released with integrated networking abilities such as peer-to-peer file and printing sharing.[10] In November, Microsoft released the first version of their popular database software Microsoft Access.[10]

Microsoft sign at entrance of Dubai Microsoft campus, Dubai Internet City. Microsoft developed Arabic versions for most of its products.
By 1993, Windows had become the most widely used GUI operating system in the world.[10] Fortune Magazine named Microsoft as the "1993 Most Innovative Company Operating in the U.S."[51] The year also marked the end of a five-year copyright infringement legal case brought by Apple, dubbed Apple Computer, Inc. v. Microsoft Corp., in which the ruling was in Microsoft's favor. Microsoft also released Windows for Workgroups 3.11, a new version of the consumer line of Windows, and Windows NT 3.1, a server-based operating system with a similar user interface to consumer versions of the operating system, but with an entirely different kernel.[10] As part of its strategy to broaden its business, Microsoft released Microsoft Encarta on March 22, 1993, the first encyclopedia designed to run on a computer.[7] Soon after, the Microsoft Home brand was introduced – encompassing Microsoft's new multimedia applications for Windows 3.x., Microsoft changed its slogan to "Where do you want to go today?" in 1994 as part of an attempt to appeal to nontechnical audiences in a US$100 million (~$187 million in 2023) advertising campaign.[10]
"""

prompt = f'{chat_template.format(input=text)}'

input_tokens = tokenizer.encode(prompt)

print("Total tokens:", len(input_tokens))

actual_n_tokens = 128
tokens_step = 128

while actual_n_tokens < len(input_tokens):
    print("Using", actual_n_tokens, "tokens")
    actual_tokens = input_tokens[:actual_n_tokens]

    params = og.GeneratorParams(model)
    params.set_search_options(**search_options)
    params.input_ids = actual_tokens
    generator = og.Generator(model, params)

    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()

        new_token = generator.get_next_tokens()[0]

    print("Done")
    del generator
    actual_n_tokens += tokens_step

Outputs this:

Total tokens: 3745
Using 128 tokens
Done
Using 256 tokens
Done
Using 384 tokens
Done
Using 512 tokens
Done
Using 640 tokens
Done
Using 768 tokens
Done
Using 896 tokens
Done
Using 1024 tokens
Done
Using 1152 tokens
2024-06-25 16:38:25.1780618 [E:onnxruntime:onnxruntime-genai, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running DmlFusedNode_0_0 node. Name:'DmlFusedNode_0_0' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\DmlGraphFusionHelper.cpp(1066)\onnxruntime.dll!00007FF8C8CDA2E1: (caller: 00007FF8C8D69109) Exception(2) tid(1838) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

Traceback (most recent call last):
  File "C:\Users\Gregory\test\mdl\local_dml\phi3-test.py", line 93, in <module>
    generator.compute_logits()
onnxruntime_genai.onnxruntime_genai.OrtException: Non-zero status code returned while running DmlFusedNode_0_0 node. Name:'DmlFusedNode_0_0' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\DmlGraphFusionHelper.cpp(1066)\onnxruntime.dll!00007FF8C8CDA2E1: (caller: 00007FF8C8D69109) Exception(2) tid(1838) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.
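
To record the first failing prompt length instead of ending on an unhandled traceback, the loop above could be wrapped so that the first failure is caught and reported. This is only a sketch: `run_prompt` is a hypothetical callable that would hold the generator loop from the script, and here it is replaced by a stand-in that fails at 1152 tokens to mirror the output above. It catches `Exception` broadly because I have not checked which base class the real exception derives from.

```python
def first_failing_length(run_prompt, total_tokens, step=128):
    """Return the smallest tested prompt length at which run_prompt raises.

    Lengths step, 2*step, ... below total_tokens are tried in order.
    Returns None if every tested length succeeds.
    """
    n_tokens = step
    while n_tokens < total_tokens:
        try:
            run_prompt(n_tokens)
        except Exception as err:  # broad on purpose, see the note above
            print("Failed at", n_tokens, "tokens:", err)
            return n_tokens
        n_tokens += step
    return None


def fake_run(n_tokens):
    # Stand-in for the real generator loop (assumption for illustration):
    # simulates the device error once the prompt reaches 1152 tokens.
    if n_tokens >= 1152:
        raise RuntimeError("887A0006 device hung (simulated)")


print(first_failing_length(fake_run, 3745))
```

With the real loop in place of `fake_run`, the process may not be able to use the GPU again after the error, so the sketch stops at the first failure rather than trying longer prompts.
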

zhangxiang1993 commented 1 month ago

Hi, thank you for the detailed report. The long-prompt failure and the performance drop are two major, stubborn bugs that we have been trying to fix recently. We are still looking into them. Thanks for your patience and for trying this out.