Inconsistent behavior between Prompt Lab and SDK for 'greedy' flan-ul2 model.

danielfrees commented 1 year ago

Version Information

Package Version: 0.2.5
Operating System: macOS 13.4 (ARM M2 mac)

What is the expected behavior?

The same prompt should yield the same generated text.

What is the actual behavior?

Precisely the same prompts yield different generated text. In the prompt lab, my summary prompt summarizes the text nicely. In the SDK, the model just echoes my input text.

Please provide a unit test that demonstrates the bug.

My prompt is shared below:

""" Summarize the paragraph, capturing meaningful dates, revenue, and numbers and shortening less important sentences.

Input:

Founded in 1990, Apollo is a high-growth, global alternative asset manager and a retirement services provider. Apollo conducts its business primarily in the United States through the following three reportable segments: Asset Management, Retirement Services and Principal Investing. These business segments are differentiated based on the investment services they provide as well as varying investing strategies. Our Businesses

Asset Management

Our Asset Management segment focuses on three investing strategies: yield, hybrid and equity. These strategies reflect the range of investment capabilities across our platform based on relative risk and return. As an asset manager, we earn fees for providing investment management services and expertise to our client base. The amount of fees charged for managing these assets depends on the underlying investment strategy, liquidity profile, and, ultimately our ability to generate returns for our clients. We also earn capital solutions fees as part of our growing capital solutions business and as part of monitoring and deployment activity alongside our sizable private equity franchise. After expenses, we call the resulting earning stream “Fee Related Earnings” or “FRE”, which represents the primary performance measure for the Asset Management segment. As of December 31, 2022, we had total AUM of $547.6 billion.

Our Asset Management segment had a team of 2,540 employees as of December 31, 2022, with offices throughout the world. This team possesses a broad range of transaction, financial, managerial and investment skills. We operate our asset management business in a highly integrated manner, which we believe distinguishes us from other alternative asset managers. Our investment teams frequently collaborate across disciplines and we believe that this collaboration enables our clients to more successfully invest across a company’s capital structure. Our objective is to achieve superior long-term risk-adjusted returns for our clients.

Output:

Apollo is a high-growth, global alternative asset manager and retirement services provider. Most of Apollo's business is conducted in the United States through the following three reportable segments: Asset Management, Retirement Services, and Principal Investing. Asset management focuses on investing with yield, hybrid, and equity strategies. Apollo earns fees for providing investment management services, and charging capital solutions fees. As of December 31, 2022, we had total AUM of $547.6 billion. The asset management team had 2,540 employees as of Dec 31, 2022.

Input:

Integra LifeSciences is a global leader in regenerative tissue technologies and neurological solutions dedicated to limiting uncertainty for clinicians so they can focus on providing the best patient care. Founded in 1989 with the acquisition of an engineered collagen technology platform used to repair and regenerate tissue, Integra LifeSciences Holdings Corporation common stock trades on the Nasdaq Global Select Market (“Nasdaq”) under the symbol “IART.” Integra has developed numerous product lines from this technology for applications ranging from burn and deep tissue wounds to the repair of dura mater in the brain, as well as nerves and tendons. The Company has expanded its base regenerative technology business to include surgical instruments, neurosurgical products and advanced wound care through global acquisitions and product development to meet the evolving needs of its customers and enhance patient care.

Output: """

_Generated output in the prompt lab, with Greedy, min_tokens = 20, and max_tokens = 300 selected:_

Integra LifeSciences is a global leader in regenerative tissue technologies and neurological solutions. Founded in 1989 with the acquisition of an engineered collagen technology platform used to repair and regenerate tissue. Integra has developed numerous product lines from this technology for applications ranging from burn and deep tissue wounds to the repair of dura mater in the brain, as well as nerves and tendons.

_Generated output in the SDK, with Greedy, min_new_tokens = 20, max_new_tokens = 300, repetition_penalty = 1.0 (it's exactly the same as the input):_

Integra LifeSciences is a global leader in regenerative tissue technologies and neurological solutions dedicated to limiting uncertainty for clinicians so they can focus on providing the best patient care. Founded in 1989 with the acquisition of an engineered collagen technology platform used to repair and regenerate tissue, Integra LifeSciences Holdings Corporation common stock trades on the Nasdaq Global Select Market (Nasdaq) under the symbol IART. Integra has developed numerous product lines from this technology for applications ranging from burn and deep tissue wounds to the repair of dura mater in the brain as well as nerves and tendons. The Company has expanded its base regenerative technology business to include surgical instruments, neurosurgical products, and advanced wound care through global acquisitions and product development to meet the evolving needs of its customers and enhance patient care.

Other notes on how to reproduce the issue?

These are my params in the SDK:

params = GenerateParams(
    decoding_method='greedy',
    min_new_tokens=20,
    max_new_tokens=300,
    repetition_penalty=1.0,
)

Note that I verified the formatting was the same between Python and the Prompt Lab. I read the prompt into Python from a file and then printed it out to verify that everything looked right.
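A printed prompt can look identical while still differing in leading or trailing whitespace, which `print()` does not make visible. Below is a minimal sketch of a stricter check, assuming the Prompt Lab text has been pasted into a second string. The helper `find_prompt_differences` and both sample strings are hypothetical and are not part of the SDK.

```python
import difflib


def find_prompt_differences(sdk_prompt: str, ui_prompt: str) -> list[str]:
    """Return repr()-style diff lines so whitespace differences are visible."""
    # keepends=True preserves newline characters, and repr() makes spaces,
    # tabs, and newlines explicit in the output.
    sdk_lines = [repr(line) for line in sdk_prompt.splitlines(keepends=True)]
    ui_lines = [repr(line) for line in ui_prompt.splitlines(keepends=True)]
    diff = difflib.unified_diff(sdk_lines, ui_lines, "sdk", "ui", lineterm="")
    # Keep only added/removed lines; drop the file headers and hunk markers.
    return [
        d for d in diff
        if d.startswith(("+", "-")) and not d.startswith(("---", "+++"))
    ]


# Hypothetical stand-ins: the first has a leading and a trailing space.
sdk_prompt = " Summarize the paragraph.\n\nOutput: "
ui_prompt = "Summarize the paragraph.\n\nOutput:"

for line in find_prompt_differences(sdk_prompt, ui_prompt):
    print(line)
```

The prompt quoted in this report starts with a space after the opening triple quotes and ends with `Output: ` (trailing space), whereas the working example later in the thread ends with `Output:`. Whether that accounts for the behavior seen here is not established in this thread; the check only makes such differences easy to spot.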

Any possible solutions?

I don't know enough about the underlying API system in this SDK to know why this is going wrong, but my guess is that there is either a model mismatch between the Prompt Lab and here for the google/flan-ul2 model, or there is something wrong with parameter passing for 'greedy' methodology prompts.

Can you identify the location in the GENAI source code where the problem exists?

No

If the bug is confirmed, would you be willing to submit a PR?

Depends on the root cause

Tomas2D commented 1 year ago

Hello @danielfrees, note that when you use the UI, some parameters may be applied automatically under the hood. To see everything that is sent to the API from the UI, click View curl command, which shows the exact HTTP body.

The next crucial thing is to escape the input, which is done automatically on the UI but not in the SDK.

Let me know.

danielfrees commented 1 year ago

Hi Tomas, the only params I'm noticing that might be different are under curl command > parameters > moderations. Any explanation on what this "hap" parameter is?

I've added re.escape(prompt) when submitting prompts via the SDK and it does not seem to make a difference.

"parameters": {
  "decoding_method": "greedy",
  "min_new_tokens": 20,
  "max_new_tokens": 300,
  "moderations": {
    "hap": {
      "input": false,
      "threshold": 0.75,
      "output": false
    }
  }
}
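One thing worth checking about the `re.escape(prompt)` attempt: `re.escape` escapes regular-expression metacharacters, so it changes the prompt text itself. The sketch below contrasts it with JSON string escaping, on the assumption (not confirmed in this thread) that the escaping meant above is the JSON encoding of the HTTP body. The `sample` string is a made-up fragment in the style of the prompt.

```python
import json
import re

sample = 'AUM of $547.6 billion.\n"FRE"'

# re.escape targets regular-expression syntax: it inserts backslashes before
# characters such as "$", ".", spaces, and newlines, so the text itself changes.
regex_escaped = re.escape(sample)

# json.dumps produces a JSON string literal: quotes and newlines are escaped
# for transport, and json.loads restores the original text exactly.
json_escaped = json.dumps(sample)

print(regex_escaped == sample)              # False: the prompt text was altered
print(json.loads(json_escaped) == sample)   # True: the round trip is lossless
```

If the model receives the `re.escape` result, it sees extra backslashes throughout the prompt, so that call is unlikely to be the kind of escaping needed for an API request.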

danielfrees commented 1 year ago

Can't seem to find where that would be passed in the GenerateParams code (or any of the other schemas for that matter). Lmk if this is something I can manage via the SDK!
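To pin down exactly which parameters differ between the Prompt Lab request and the SDK call, a plain dictionary comparison is enough. This is only a sketch: the two dictionaries are copied by hand from the curl body and the `GenerateParams` call quoted in this thread, and `diff_params` is a hypothetical helper, not an SDK function.

```python
# Parameters copied from the Prompt Lab curl body quoted in this thread.
ui_params = {
    "decoding_method": "greedy",
    "min_new_tokens": 20,
    "max_new_tokens": 300,
    "moderations": {"hap": {"input": False, "threshold": 0.75, "output": False}},
}

# Parameters passed to GenerateParams in the original report.
sdk_params = {
    "decoding_method": "greedy",
    "min_new_tokens": 20,
    "max_new_tokens": 300,
    "repetition_penalty": 1.0,
}


def diff_params(left: dict, right: dict) -> dict:
    """Map each differing key to its (left, right) values; None means absent."""
    keys = left.keys() | right.keys()
    return {
        key: (left.get(key), right.get(key))
        for key in sorted(keys)
        if left.get(key) != right.get(key)
    }


for key, (ui_value, sdk_value) in diff_params(ui_params, sdk_params).items():
    print(f"{key}: ui={ui_value!r} sdk={sdk_value!r}")
```

With the values above, only `moderations` (UI only) and `repetition_penalty` (SDK only) show up as differences.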

Tomas2D commented 1 year ago

You can pass moderations to the GenerateParams class. There is no explicitly declared moderations property, but it will still work. Note, however, that the moderations parameter does not alter the output in any way.

Tomas2D commented 1 year ago

Here is my minimal code that generates the same output as UI does.

import os

import dotenv
from genai import Model, Credentials
from genai.schemas import GenerateParams

dotenv.load_dotenv()

prompt = """Summarize the paragraph, capturing meaningful dates, revenue, and numbers and shortening less important sentences.

Input:

Founded in 1990, Apollo is a high-growth, global alternative asset manager and a retirement services provider. Apollo conducts its business primarily in the United States through the following three reportable segments: Asset Management, Retirement Services and Principal Investing. These business segments are differentiated based on the investment services they provide as well as varying investing strategies.
Our Businesses

Asset Management

Our Asset Management segment focuses on three investing strategies: yield, hybrid and equity. These strategies reflect the range of investment capabilities across our platform based on relative risk and return. As an asset manager, we earn fees for providing investment management services and expertise to our client base. The amount of fees charged for managing these assets depends on the underlying investment strategy, liquidity profile, and, ultimately our ability to generate returns for our clients. We also earn capital solutions fees as part of our growing capital solutions business and as part of monitoring and deployment activity alongside our sizable private equity franchise. After expenses, we call the resulting earning stream “Fee Related Earnings” or “FRE”, which represents the primary performance measure for the Asset Management segment. As of December 31, 2022, we had total AUM of $547.6 billion.

Our Asset Management segment had a team of 2,540 employees as of December 31, 2022, with offices throughout the world. This team possesses a broad range of transaction, financial, managerial and investment skills. We operate our asset management business in a highly integrated manner, which we believe distinguishes us from other alternative asset managers. Our investment teams frequently collaborate across disciplines and we believe that this collaboration enables our clients to more successfully invest across a company’s capital structure. Our objective is to achieve superior long-term risk-adjusted returns for our clients.

Output:

Apollo is a high-growth, global alternative asset manager and retirement services provider. Most of Apollo's business is conducted in the United States through the following three reportable segments: Asset Management, Retirement Services, and Principal Investing. Asset management focuses on investing with yield, hybrid, and equity strategies. Apollo earns fees for providing investment management services, and charging capital solutions fees. As of December 31, 2022, we had total AUM of $547.6 billion. The asset management team had 2,540 employees as of Dec 31, 2022.

Input:

Integra LifeSciences is a global leader in regenerative tissue technologies and neurological solutions dedicated to limiting uncertainty for clinicians so they can focus on providing the best patient care. Founded in 1989 with the acquisition of an engineered collagen technology platform used to repair and regenerate tissue, Integra LifeSciences Holdings Corporation common stock trades on the Nasdaq Global Select Market (“Nasdaq”) under the symbol “IART.” Integra has developed numerous product lines from this technology for applications ranging from burn and deep tissue wounds to the repair of dura mater in the brain, as well as nerves and tendons. The Company has expanded its base regenerative technology business to include surgical instruments, neurosurgical products and advanced wound care through global acquisitions and product development to meet the evolving needs of its customers and enhance patient care.

Output:"""

model = Model(
    model="google/flan-ul2",
    params=GenerateParams(
        decoding_method="greedy",
        min_new_tokens=20,
        max_new_tokens=300,
        repetition_penalty=1.0,
    ),
    credentials=Credentials(
        api_key=os.getenv("GENAI_KEY"), api_endpoint=os.getenv("GENAI_ENDPOINT")
    ),
)

responses = model.generate(prompts=[prompt])

print(responses[0].generated_text)
# Integra LifeSciences is a global leader in regenerative tissue technologies and neurological solutions. Founded in 1989 with the acquisition of an engineered collagen technology platform used to repair and regenerate tissue. Integra has developed numerous product lines from this technology for applications ranging from burn and deep tissue wounds to the repair of dura mater in the brain, as well as nerves and tendons.

danielfrees commented 1 year ago

I tested your code and also got the correct output, which bewildered me since I set up the exact same GenerateParams, model, and prompt in the code that was giving me trouble the last few days. I then went and tested re-running my original code and it works now (also giving me the correct answer)...

Has anything changed re: the watsonx.ai servers or API recently?

I guess all is well that ends well, but wish I knew what was causing all the original strange behavior!

IBM / ibm-generative-ai