vectara / vectara-docs

Documentation for Vectara's GenAI Platform
https://docs.vectara.com
Apache License 2.0

Metadata structuring is not handled in the same way as the docs for API v2 #357

Open abhishekpradhan opened 18 hours ago

abhishekpradhan commented 18 hours ago

Pages

https://docs.vectara.com/docs/api-reference/search-apis/interpreting-responses/metadata

Description

Metadata in query responses is typed: a value that is a pure number is returned as a number, and the same applies to booleans and JSON.

Here's the issue. This example from the docs:

      "part_metadata": {
        "speaker": "Deep Thought",
        "lang": "eng",
        "section": "2",
        "offset": "316"
      },
      "document_metadata": {
        "author": "Douglas Adams",
        "publicationyear": "1979"
      },

will actually be returned as

      "part_metadata": {
        "speaker": "Deep Thought",
        "lang": "eng",
        "section": 2,
        "offset": 316
      },
      "document_metadata": {
        "author": "Douglas Adams",
        "publicationyear": 1979
      },

It looks like further down in the same document this is addressed correctly.

I would also recommend adding this to https://docs.vectara.com/docs/migration-guide-api-v2, since v1 handled it differently.

abhishekpradhan commented 17 hours ago

The follow-up question we got was what this affects. It only applies when we query documents, so uploading documents using v2 does not convert the metadata. (The conversion is also limited to the metadata.)
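A minimal Python sketch of the query-time conversion described above, under stated assumptions: `coerce` is a hypothetical helper and not Vectara code, the three patterns are the ones quoted in the later comments of this thread, and falling back to the original string when JSON parsing fails is a guess rather than documented behavior.

```python
import json
import re

# Patterns as quoted in this thread; the logic around them is an assumption.
NUMBER = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")
BOOLEAN = re.compile(r"true|false")
JSON_LIKE = re.compile(r"[{|\[].*")


def coerce(value: str):
    """Hypothetical helper: guess the type a string metadata value is returned as."""
    if NUMBER.fullmatch(value) or BOOLEAN.fullmatch(value):
        # Anything these two patterns accept is a valid JSON literal.
        return json.loads(value)
    if JSON_LIKE.fullmatch(value):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value  # assumed fallback: keep the string unchanged
    return value


# Metadata as shown in the docs example (all strings).
stored = {"speaker": "Deep Thought", "lang": "eng", "section": "2", "offset": "316"}
returned = {key: coerce(val) for key, val in stored.items()}
print(returned)
# {'speaker': 'Deep Thought', 'lang': 'eng', 'section': 2, 'offset': 316}
```

A client that previously compared `section` against the string `"2"` would need to compare against the number `2` after this kind of conversion.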

abhishekpradhan commented 16 hours ago

Numbers: `-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?`

| Input | Matches? | Explanation |
| --- | --- | --- |
| `123` | Yes | Valid integer. |
| `0` | Yes | Valid zero. |
| `-456` | Yes | Valid negative integer. |
| `3.14` | Yes | Valid decimal number. |
| `-0.001` | Yes | Valid negative decimal. |
| `2e10` | Yes | Valid scientific notation. |
| `-1.23E-4` | Yes | Valid negative number in scientific notation. |
| `.5` | No | Invalid (missing leading integer). |
| `1e` | No | Invalid (missing exponent value). |
| `1.2.3` | No | Invalid (multiple decimal points). |
| `-` | No | Invalid (missing digits). |

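The number rows above can be checked with a short Python sketch. It assumes the integer part of the pattern is the alternation `(?:0|[1-9]\d*)`, and it assumes whole-string matching (`re.fullmatch`), since the quoted pattern carries no `^` or `$` anchors and the invalid rows only make sense if the entire value must match.

```python
import re

# Number pattern from this comment; the alternation in the first group is assumed.
NUMBER = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")

samples = ["123", "0", "-456", "3.14", "-0.001", "2e10", "-1.23E-4",
           ".5", "1e", "1.2.3", "-"]

# Whole-string matching is an assumption about how the pattern is applied.
results = {s: bool(NUMBER.fullmatch(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r:12} {'match' if ok else 'no match'}")

# An unanchored search would instead accept part of an invalid input.
partial = NUMBER.search(".5").group()
print(partial)  # 5
```

The last line shows why anchoring matters: without it, `.5` would be treated as containing the number `5`.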
abhishekpradhan commented 16 hours ago

Booleans: `^(true|false)$`

| Input | Matches? | Explanation |
| --- | --- | --- |
| `"true"` | Yes | Exact match for true. |
| `"false"` | Yes | Exact match for false. |
| `" true"` | No | Invalid (leading space). |
| `"false "` | No | Invalid (trailing space). |
| `"True"` | No | Invalid (case-sensitive; must be lowercase). |
| `"TRUE"` | No | Invalid (case-sensitive; must be lowercase). |
| `"falsey"` | No | Invalid (extra characters after false). |
| `"truest"` | No | Invalid (extra characters after true). |
| `"tru"` | No | Invalid (partial match; incomplete true). |

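A small Python sketch that replays the boolean rows above. Which regex engine the platform uses is not stated in this thread, so the use of Python's `re` module here is purely illustrative.

```python
import re

# Boolean pattern exactly as quoted in this comment.
BOOLEAN = re.compile(r"^(true|false)$")

samples = ["true", "false", " true", "false ", "True", "TRUE",
           "falsey", "truest", "tru"]

results = {s: bool(BOOLEAN.fullmatch(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r:10} {'match' if ok else 'no match'}")

# Python-specific caveat: `$` also matches just before a trailing newline,
# so match() accepts "true\n" while fullmatch() rejects it.
loose = bool(BOOLEAN.match("true\n"))
strict = bool(BOOLEAN.fullmatch("true\n"))
print(loose, strict)  # True False
```

Only the two exact lowercase literals match, which agrees with the rows above.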
abhishekpradhan commented 16 hours ago

JSON: `^[{|\[].*$`

| Input | Matches? | Explanation |
| --- | --- | --- |
| `{example}` | Yes | Starts with `{` and has additional content. |
| `[data]` | Yes | Starts with `[` and has additional content. |
| `\|pipe` | Yes | Starts with `\|` and has additional content. |
| `{` | Yes | Matches a single `{` at the start. |
| `\|` | Yes | Matches a single `\|` at the start. |
| `[` | Yes | Matches a single `[` at the start. |
| `example` | No | Does not start with `{`, `\|`, or `[`. |
| `something{` | No | Starts with `s`, not `{`, `\|`, or `[`. |
| (empty string) | No | Empty string does not match. |
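The JSON rows above can be replayed in Python as follows. The sketch also shows that the pattern only inspects the first character: inside a character class `|` is a literal, so `|pipe` matches, and a match does not imply the value parses as JSON. What the platform does with a value that matches but fails to parse is not stated in this thread, so `parses_as_json` is a hypothetical helper for illustration only.

```python
import json
import re

# JSON pattern exactly as quoted in this comment.
JSON_LIKE = re.compile(r"^[{|\[].*$")

samples = ["{example}", "[data]", "|pipe", "{", "|", "[",
           "example", "something{", ""]

results = {s: bool(JSON_LIKE.match(s)) for s in samples}


def parses_as_json(text: str) -> bool:
    """Hypothetical helper: True if json.loads accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


for s, ok in results.items():
    print(f"{s!r:14} pattern={ok!s:5} valid_json={parses_as_json(s)}")

# A well-formed value passes both checks.
well_formed = '{"planet": "Earth"}'
print(bool(JSON_LIKE.match(well_formed)), parses_as_json(well_formed))  # True True
```

None of the sample inputs is valid JSON, even the six that match the pattern, so the pattern is best read as a cheap pre-check rather than a validator.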