spolu commented 5 months ago

Given positive feedback so far on the action itself and associated documentation from at least 3 different sources, it feels like the Extract Data action is ready for ungating.

We'll be able to announce it in our next Product update, and @albandum is working on a blog post.

I don't believe it requires much more ceremony than that.

@Duncid @gabhubert thoughts?

Duncid commented 5 months ago

Imo before ungating we should work on an extra layer of UI fine tuning, in particular around the generation of the "schema" part.

gabhubert commented 5 months ago

Deferring to @Duncid on the polish to maintain a UI/UX bar on the product, but I think we do have some promising signal and want to learn fast. Do we have the things you are thinking about framed somewhere?

spolu commented 5 months ago

@Duncid happy to continue work on it as soon as you have a proposal. But also, 100% of the people interested in it understood it without any support, so I would not block on it?

Duncid commented 5 months ago

Ok, here is the problem I think we're having more and more:

A lot of the product innovation is tech driven (which is 100% desirable in our case)
So we start with building
Then we have something functional and we want to test
- At this point, too early to invest a lot in the XP
Then we have positive feedback
- At this point, because it's functional and it brings value, why not launch?

The problem is that products that are not designed tend to be entirely the product of a technical perspective: centered around technical considerations rather than user considerations, ugly, overly complicated, using engineering lingo…

So they need to be "re-invented". Some of that can be done over time, but:

Some is breaking, thus hard to do later
Later often means never, and we're building debt
It impacts the perception of the overall quality of the product
People testing are usually people interested in the feature. They will give feedback on the feature technically working and bringing the value but they can't give us feedback on the overall complexity created or the affordance of the feature to other users.
We need to be careful that technical thinking does not prevent us from thinking product. We need a rough idea of where this is going mid / long term to optimise short term. In other words: "which piece of the puzzle does this fit into?"

IMO, guiding principles would be:

Build first, build fast
Validate with users
Ship limited but lasting
- Avoid one-way doors (things that will generate breaking changes later)

Practically in our case

High level problems

We know people already don't understand Most Recent. We're adding Extract.

The relation between Instructions and Actions is difficult to understand. Schema is deduced from Instructions, but can work independently. Until now, people have been used to describing the output in instructions, with examples of structured output.

I don't understand what's under the hood here, so maybe Schema is different from a description of an output in Instructions, but semantically it's very much alike.

Thoughts

Essentially, the feature is Most Recent (systematic, chronological) + structured output (Schema).

It feels like Schema could actually work on pretty much any type of retrieval:

After a semantic search
After a web search
…

Schema as an option for "most recent"

One way to avoid creating a new action is to have it as an option of Most Recent. The user would select "Most recent" (or whatever we decide to name it), and define a structure (Schema) that would be optional and suggested based on instructions.

Schema as a part of Instructions

Pushing it one step further, I wonder if the feature here is not "Schema", something that user could use in Instructions as a more structured way to define an output.

Schema could be suggested depending on the instructions (Suggestions could work similarly to Copilot in VSCode?).

It could take the shape of an editing tool, like a "table" tool.

In that world, a user can specify a structure for any type of answer, no new action is required.

Formatting as a new user input

Pushing it one step further, we could add a separate field in Instructions for defining an output format. It would be optional, and could be structured (schema) or unstructured (examples, description of the type of output…).

Formatted Answer action

Would it make sense to have separate "Most recent" and "Formatted answer" actions? The action would be "Answer in a structured way".

Minimal Quality changes

IMO, we should consider the above before shipping. If we still want to ship fast and think later, here are the quality-related changes that I would make ahead of shipping:

Description

The assistant will process the data sources over the specified time frame and attempt to extract structured information based on the schema provided. This action can process up to 500k tokens (the equivalent of 1000 pages book). Learn more about this feature in the documentation.

change to:

This action scans all data sources within the specified time frame, extracting information based on a predefined schema. It can process the equivalent of a 1,000-page book (500k tokens). More in the documentation.

Schema

When the user enters "Action" and "Extract" is selected, or when the user selects "Extract", we generate the schema in the background
An empty "Schema" looks like the empty "Select Data Source" block, with a big "Add fields" button
"Add fields" opens a menu: "Add empty field", followed by a list of suggested fields (Name, Type, Description)
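
To make the proposed "Add fields" menu concrete, here is a minimal TypeScript sketch. The field shape follows the (Name, Type, Description) triple above; `SchemaField`, `MenuEntry`, and `buildAddFieldMenu` are hypothetical names for illustration and are not taken from the Dust codebase. Skipping suggestions that are already in the schema is also an assumption.

```typescript
// Hypothetical shapes for illustration; not the actual Dust types.
type FieldType = "string" | "number" | "boolean";

interface SchemaField {
  name: string;
  type: FieldType;
  description: string;
}

type MenuEntry =
  | { kind: "empty"; label: string }
  | { kind: "suggestion"; label: string; field: SchemaField };

// Build the "Add fields" menu: one "Add empty field" entry, followed by
// suggested fields whose names are not already present in the schema.
function buildAddFieldMenu(
  schema: SchemaField[],
  suggestions: SchemaField[]
): MenuEntry[] {
  const seen = new Set(schema.map((f) => f.name.toLowerCase()));
  const entries: MenuEntry[] = [{ kind: "empty", label: "Add empty field" }];
  for (const field of suggestions) {
    const key = field.name.toLowerCase();
    if (!seen.has(key)) {
      entries.push({ kind: "suggestion", label: field.name, field });
      seen.add(key); // also drop duplicates within the suggestions themselves
    }
  }
  return entries;
}
```

With this shape, regenerating suggestions after editing the instructions would only surface fields the user does not already have, so manual edits are never overwritten.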

gabhubert commented 5 months ago

fwiw @Duncid I agree with you on the product dev level, and we're definitely leading with tech feasibility in this new frontier world. We can course correct over time as we reduce our confidence that the thing is good because it's possible.

Let's pause a bit here, and take your points to see if we can find a direction we'd be sad not to have thought about. We can timebox it to 2 working days (by our weekly).

On your comments

It feels like Schema could actually work on pretty much any type of retrieval:

After a semantic search After a web search …

I believe that's not the case, right? The semantic search outputs chunks first, which means you can't be confident of extracting all results.

Schema as a part of Instructions Pushing it one step further, I wonder if the feature here is not "Schema", something that user could use in Instructions as a more structured way to define an output.

Schema could be suggested depending on the instructions (Suggestions could work similarly to Copilot in VSCode?).

It could take the shape of an editing tool, like a "table" tool.

I might be wrong, but I think that logically, schema comes before output, and so this isn't ok. Am I wrong?

gabhubert commented 5 months ago

One thing I do find missing from the current proposal is that extract should work with reply only + contentFragment, but doesn't in the current format, right?

Duncid commented 5 months ago

I believe that's not the case, right? The semantic search outputs chunks first, which means you can't be confident of extracting all results.

There is always the problem of "too much data" but that's also true in the current implementation (there may be too much data in 1 day)

As per IRL discussion:

One thing I do find missing from the current proposal is that extract should work with reply only + contentFragment, but doesn't in the current format, right?

In a multi-action world, and a world where Schema is a separate action, that would maybe work.

spolu commented 5 months ago

I lean towards releasing as quickly as possible with fast follow work on "Minimal Quality changes" given the fact that everything else is a much broader product question related to the definition of assistants and the assistant builder.

Do you want to block on "Minimal Quality changes"? I think we should not esp if we tackle them rapidly?

Duncid commented 4 months ago

Do you want to block on "Minimal Quality changes"? I think we should not esp if we tackle them rapidly?

I don't want to block anything ^^ As it is, IMO it doesn't pass the front-page test. That's ok with me if we have a strong commitment to ship something that does in the coming days, but then is it really worth launching ahead?

I can work on that now.

spolu commented 4 months ago

As it is, IMO it doesn't pass the front-page test.

It definitely does. It's ok to release fast and a bit broken if usage is there. That definitely passes the front-page test :)

That's ok with me if we have a strong commitment to ship something that does in the coming days, but then is it really worth launching ahead?

Sure, will make a pass on it. Main worry is that your proposal raises some questions:

What happens if there are no instructions?
What happens if one clicks before the suggestions are computed?

Duncid commented 4 months ago

It definitely does. It's ok to release fast and a bit broken if usage is there. That definitely passes the front-page test :)

Then we need to align on what it means. As it is, it does not communicate good stuff about Dust. It looks complicated and technical, as well as not polished visually. So more or less "product for engineers" when we try to make AI accessible to team workers.

I would still be happy that we get a front page, but I would feel we're missing an opportunity to communicate the right message.

What happens if there are no instructions?

Add a field SUGGESTIONS No suggestions

What happens if one clicks before the suggestions are computed?

Add a field SUGGESTIONS {Spinner}
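
The two answers above amount to a small state machine for the suggestions list. A minimal TypeScript sketch, assuming a hypothetical `SuggestionsState` type and `renderSuggestions` helper (neither is from the Dust codebase), with the rendered rows returned as plain strings for illustration:

```typescript
// Hypothetical state for the suggestions part of the "Add a field" menu.
type SuggestionsState =
  | { status: "loading" } // suggestions are still being computed
  | { status: "ready"; names: string[] }; // may be empty, e.g. no instructions

// Return the rows shown under the menu, as plain strings.
function renderSuggestions(state: SuggestionsState): string[] {
  const header = "SUGGESTIONS";
  if (state.status === "loading") {
    return [header, "{Spinner}"];
  }
  if (state.names.length === 0) {
    return [header, "No suggestions"];
  }
  return [header, ...state.names];
}
```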

Working on UI :)

spolu commented 4 months ago

I quite dislike that UX. I think there is value in getting the full schema in one go, and there is value in retrying the generation since this is a model generating and the first try might not be the right one.

How about we keep the current interaction unchanged but apply some pure UI (non functional) improvements?

Duncid commented 4 months ago

I quite dislike that UX. I think there is value in getting the full schema in one go, and there is value in retrying the generation since this is a model generating and the first try might not be the right one.

One problem the current UX has is that it will erase all the fields when clicking generate. So if I edited my instructions and come back I either replace all or edit manually.

The proposed UX lets me generate once, edit instructions, and add again: the menu will be enriched with new proposals that I can add one by one rather than all or nothing.

How about we keep the current interaction unchanged but apply some pure UI (non functional) improvements?

As explained above, I think the current interaction is going to be frustrating for anyone who made manual edits. What we can do is add a Dialog warning that the entire existing schema will be erased, but modals are not great UI and force the user to choose between keeping the current version and replacing it with an entirely new one.

Making proposals in both directions.

spolu commented 4 months ago

One problem the current UX has is that it will erase all the fields when clicking generate. So if I edited my instructions and come back I either replace all or edit manually.

I don't think that's a big problem since you have very little incentive to regenerate once you converge on something good.

Also you have to click Save to really override.

The proposed UX lets me generate once, edit instructions, and add again: the menu will be enriched with new proposals that I can add one by one rather than all or nothing.

Not really. It makes the process of getting a first proposal out super cumbersome. You have to click on a dropdown as many times as there are fields, and I'm not sure you'll get that you're building a schema because of that.

As explained above, I think the current interaction is going to be frustrating for anyone who made manual edits.

I think edits don't require models. The model interaction is good to get a first version out and "understand" the concept of the schema.

What will be very painful with your proposal especially in the "edit" mode is that you'll get suggestions from the model as you come back that will be roughly the same as the ones you already have which will create clutter and noise and will probably confuse the user more than anything else?

What we can do is add a Dialog warning that the entire existing schema will be erased, but modals are not great UI and force the user to choose between keeping the current version and replacing it with an entirely new one.

Happy to add that modal if the schema is not empty which I thiiink works great. If your intent is to explore what the button does, you'll cancel. And if your intent is to override then you'll likely be pleased by the safety.

spolu commented 4 months ago

I can add a UI that looks like the empty data source section, with the generate from instructions button in the middle.
Biggest question is where should the spinner be once you click?
I guess we can have the add field button only visible once a schema has been generated.

Duncid commented 4 months ago

UI

There you go @spolu take your pick!

Identical

Suggestions

Note: Would generate suggestions when entering Action in mono-action, and when opening the Action for editing in multi-action.

An in-between (for the record, no strong belief in that direction)

Fud for now?

Types

Do we really need the "Type"? Could "type" be more user friendly (Text, Number, Yes/No in place of string / number / boolean)? Should we have more types (email, phone number…)?
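
As a rough illustration of the friendlier naming, the underlying types could stay unchanged while only the labels shown in the UI differ. A minimal TypeScript sketch; the `FieldType` union, the label table, and both helper names are assumptions for illustration, not existing Dust code:

```typescript
// Hypothetical underlying types, as they would be sent to the model.
type FieldType = "string" | "number" | "boolean";

// User-facing labels, following the naming suggested above.
const FRIENDLY_LABELS: Record<FieldType, string> = {
  string: "Text",
  number: "Number",
  boolean: "Yes/No",
};

// Label displayed in the type dropdown for a given underlying type.
function friendlyLabel(type: FieldType): string {
  return FRIENDLY_LABELS[type];
}

// Reverse lookup: underlying type for a selected label, if any.
function typeFromLabel(label: string): FieldType | undefined {
  const types = Object.keys(FRIENDLY_LABELS) as FieldType[];
  return types.find((t) => FRIENDLY_LABELS[t] === label);
}
```

Adding richer types such as email or phone number would mean extending both the union and the label table, and deciding which underlying type each one maps to.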

Fud for later

The above addresses small, micro-interaction-level issues. But it feels like the tip of the iceberg, the bulk being that it's very hard to understand what the action does, how it works, and how it connects with instructions.

It's really not obvious that:

should go with:

To produce:

I push on this because I strongly believe in the value of the feature, but I'm afraid that as is it's only going to be touched by very few people (as even we have a hard time explaining it internally). So I'm afraid we'll ship and forget, fail to get most of the value out of the work that's been done, or even overall have negative value created because of the added complexity.

spolu commented 4 months ago

Will go with 1 as a first step and let's keep our ears to the ground for feedback from users :+1:

Duncid commented 4 months ago

@spolu curious about the types question though.

spolu commented 4 months ago

Types are aligned with what models provide as an interface. We could remove them and consider everything as a string but that would prevent some potential future use cases with code interpretation. Because we're still exploring these interactions I would be in favor of keeping them here.
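
For illustration, assuming the model interface in question accepts a JSON Schema style description of the expected output (as function-calling interfaces commonly do), the schema fields could translate as in the TypeScript sketch below. `SchemaField`, `JsonSchemaObject`, and `toJsonSchema` are hypothetical names, and marking every field as required is an assumption.

```typescript
// Hypothetical field shape; not the actual Dust types.
type FieldType = "string" | "number" | "boolean";

interface SchemaField {
  name: string;
  type: FieldType;
  description: string;
}

// Minimal subset of a JSON Schema object definition.
interface JsonSchemaObject {
  type: "object";
  properties: Record<string, { type: FieldType; description: string }>;
  required: string[];
}

// Convert user-defined fields into a JSON Schema style object.
// Assumption: every field is required.
function toJsonSchema(fields: SchemaField[]): JsonSchemaObject {
  const properties: JsonSchemaObject["properties"] = {};
  for (const field of fields) {
    properties[field.name] = {
      type: field.type,
      description: field.description,
    };
  }
  return {
    type: "object",
    properties,
    required: fields.map((f) => f.name),
  };
}
```

Treating everything as a string would collapse `type` to a single value here, which is the information later code interpretation would lose.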

I really think no totally non-technical person will ever understand this action anyway. For others I think it's actually desirable.

Would love to get more feedback to make these decisions and hopefully we'll get some :+1:

dust-tt / dust

[Process Action] release #5207
