Closed GoogleCodeExporter closed 8 years ago
Marty, can you vet this? It seems reasonable to me based on a quick scan of the
ticket, bug, and documentation.
On the other hand, the comma is in the term break list. Mahesh, do you have any
concrete reason to remove the comma, beyond it being unnecessary and not recommended?
Original comment by jl1615@gmail.com
on 30 Apr 2009 at 9:59
Seems reasonable to me. I'm not sure what you mean by 'term break list'?
Original comment by mar...@google.com
on 1 May 2009 at 12:01
Oh, you made me look it up! I mean the metatag_restrict_substring_separator
command line flag mentioned in bug #231438.
Original comment by jl1615@gmail.com
on 1 May 2009 at 12:41
Yes, it seems that, given the current restricted separator list, the comma would be
innocuous. It's not clear from the support ticket that the comma is causing any
problems, since the bug and support discussion seem related to embedded '.' and '&'
characters.
I would second John's question: do we have any specific case of the comma causing a
support issue?
Original comment by mar...@google.com
on 1 May 2009 at 1:17
After further research on this issue, I conclude that using either the current
delimiter (", ") or the proposed new delimiter (" ") will cause issues.
Issue with current delimiter:
-----------------------------
Let's say the "resources" field is a repeating attribute that stores person names in
the form FIRST_NAME, LAST_NAME.
For doc1, "resources" field contains only a single value.
Muhammad, Sheikh
Corresponding meta tag field in feed XML -
<meta name="resources" content="Muhammad, Sheikh"/>
For doc2, "resources" field contains two values.
Adam, Muhammad
Sheikh, Abdullah
Corresponding meta tag field in feed XML -
<meta name="resources" content="Adam, Muhammad, Sheikh, Abdullah"/>
With legacy CMS search tools, a search for {resources=Muhammad, Sheikh} returns
only doc1.
But the GSA returns both doc1 and doc2.
Moreover, the comma is in the term break list of bug #231438.
Issue with proposed delimiter:
-----------------------------
Let's say the "location" field is a repeating attribute that stores location names.
For doc3, "location" field contains only a single value.
Virginia Washington
Corresponding meta tag field in feed XML -
<meta name="location" content="Virginia Washington"/>
For doc4, "location" field contains two values.
West Virginia
Washington D.C.
Corresponding meta tag field in feed XML -
<meta name="location" content="West Virginia Washington D.C."/>
With legacy CMS search tools, a search for {location=Virginia Washington} returns
only doc3.
But the GSA returns both doc3 and doc4.
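Both failure cases above can be reproduced with a small sketch. It assumes the values are flattened with a plain string join and that the search side behaves like a simple substring match; phrase_match is a hypothetical stand-in, not the GSA's actual matching logic, and all function names are made up for illustration.

```python
def flatten(values, delimiter):
    """Join repeating attribute values into one content string."""
    return delimiter.join(values)


def phrase_match(content, query):
    """Hypothetical stand-in for the search backend: plain substring match."""
    return query in content


def matching_docs(docs, query, delimiter):
    """Return the ids of documents whose flattened content contains the query."""
    return sorted(
        doc_id
        for doc_id, values in docs.items()
        if phrase_match(flatten(values, delimiter), query)
    )


# Repeating attribute values taken from the examples above.
resources = {
    "doc1": ["Muhammad, Sheikh"],
    "doc2": ["Adam, Muhammad", "Sheikh, Abdullah"],
}
locations = {
    "doc3": ["Virginia Washington"],
    "doc4": ["West Virginia", "Washington D.C."],
}

# Current delimiter ", " and proposed delimiter " ".
comma_hits = matching_docs(resources, "Muhammad, Sheikh", ", ")
space_hits = matching_docs(locations, "Virginia Washington", " ")
print(comma_hits)
print(space_hits)
```

In both cases the second document matches only because the query spans the boundary between two joined values, so any delimiter that can also occur inside a value has the same problem.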
My 2 cents:
Consider the last example (doc4) above.
What if the connector generated the feed XML as follows to handle repeating
attribute values?
<meta name="location" content="West Virginia"/>
<meta name="location" content="Washington D.C."/>
Are there any caveats to this approach? Well, I leave it to you guys.
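A minimal sketch of that idea, assuming the connector has the values as a list: emit one meta element per value instead of joining them. meta_tags is a hypothetical helper, not the connector's code, and whether the GSA treats repeated meta names as separate values is exactly the open question here. quoteattr from the Python standard library also escapes embedded '&' characters.

```python
from xml.sax.saxutils import quoteattr


def meta_tags(name, values):
    """Build one <meta> element per repeating attribute value (hypothetical helper)."""
    return [
        f"<meta name={quoteattr(name)} content={quoteattr(value)}/>"
        for value in values
    ]


tags = meta_tags("location", ["West Virginia", "Washington D.C."])
for tag in tags:
    print(tag)
```

Because no delimiter is involved, a value such as "Virginia Washington" can no longer be produced by accident from two neighbouring values.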
Original comment by lightbends
on 6 May 2009 at 10:08
Rather than affect our current customers, we are leaving this as is for now.
Original comment by mgron...@gmail.com
on 6 May 2009 at 11:05
Original issue reported on code.google.com by
lightbends
on 30 Apr 2009 at 10:48