[RFC] Stage 0: Introducing LLM fields

susan-shu-c commented 2 months ago

Overview

This RFC proposes LLM fields in response to the growth of Generative AI and LLM logging. These fields will benefit our customers and users by allowing them to monitor and protect their LLM/Generative AI deployments.

Update 1: Mika and I have kicked off a conversation with the OTel LLM working group and will proceed there first. We will supersede this PR with an OTel PR, then loop back to this ECS PR.

Update 2: Opened OTel discussion issue here

Checklist

Have you signed the contributor license agreement?
Have you followed the contributor guidelines?
For proposing substantial changes or additions to the schema, have you reviewed the RFC process?
If submitting code/script changes, have you verified all tests pass locally using make test?
If submitting schema/fields updates, have you generated new artifacts by running make and committed those changes?
Is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed.
Have you added an entry to the CHANGELOG.next.md?

mjwolf commented 2 months ago

Hi @susan-shu-c, with Elastic's donation of ECS to OpenTelemetry, we're going to start adding new features to OpenTelemetry first, and then backport changes into ECS. You can find more details about this here.

There's an OpenTelemetry LLM working group which has already added some OTel schemas for LLMs. Could you consider getting involved with the OTel working group to add these fields?

peasead commented 2 months ago

@susan-shu-c I'd like to be a part of the discussions. I see that they're using a genai.* fieldset and I think we may want to have discussions with OTel about llm.* being a top-level field, with genai.* being nested under, along with other categories.

trisch-me commented 2 months ago

@peasead FYI, OTel currently has no option to "embed" namespaces into each other like we do in ECS, but it is planned to be implemented ASAP.

For now, that means the nesting would be written out directly in dot notation within the namespace, e.g. llm.gen_ai.*
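
To illustrate the difference between embedded namespaces and direct dot notation, here is a minimal sketch in Python. It assumes attributes are represented as plain dictionaries; the `flatten` helper and the field names (`model`, `prompt_tokens`) are hypothetical and are not taken from the ECS or OTel schemas.

```python
def flatten(nested, prefix=""):
    """Flatten a nested mapping into flat, dot-separated attribute keys."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so the parent namespace becomes part of each key name.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Hypothetical field names, for illustration only.
# Nested form: one namespace embedded inside another.
nested_style = {"llm": {"gen_ai": {"model": "example-model", "prompt_tokens": 12}}}

# Flat form: the namespace path is spelled out in every attribute name.
flat_style = flatten(nested_style)
print(flat_style)
```

In the flat form, every key carries the full `llm.gen_ai.` prefix itself, which is what writing the namespace in direct dot notation amounts to.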

susan-shu-c commented 2 months ago

> There's an OpenTelemetry LLM working group which has already added some OTel schemas for LLMs. Could you consider getting involved with the OTel working group to add these fields?

Will participate in some WG meetings; the next one is on May 1 (shoutout to @trisch-me for advising). Will edit this PR accordingly.

peasead commented 2 months ago

> @susan-shu-c I'd like to be a part of the discussions. I see that they're using a genai.* fieldset and I think we may want to have discussions with OTel about llm.* being a top-level field, with genai.* being nested under, along with other categories.

Just to close the loop on this, I misunderstood the AI hierarchy. My original comment was incorrect and should be disregarded.

susan-shu-c commented 2 months ago

Update: Mika and I have kicked off a conversation with the OTel LLM working group and will proceed there first.

susan-shu-c commented 2 months ago

Update 2: Opened OTel discussion issue here

trisch-me commented 2 months ago

I agree with the comment in the OTel issue that for larger submissions like this one, the better way is to split them into subgroups. There will most probably be long debates about naming and definitions, so it's much easier for small submissions to get in faster. Also, if you see this happening, don't be afraid to split the submission again and extract those fields into a separate PR/issue. Sometimes we add just one field if it's controversial enough to spark heated discussion.

peasead commented 1 month ago

Hey @susan-shu-c, is there anything I can do to help with the gen_ai.threat.* fields?

I saw that one of the responses was to break the areas into different issues and PRs. I know that these can be intense, so if there is anything I can help with regarding the threat fieldset, please let me know.

elastic / ecs

[RFC] Stage 0: Introducing LLM fields #2337
