Closed ipeirotis closed 11 years ago
Clarification:
When we reach step 3 we have one result (row) --> (sid, stitle, tid, ttitle), where "s" stands for source and "t" for target. A target page is a primary page, so we know which is the redirect (source) and which is the non-redirect (target).
Is it necessary to perform the 3b, 3c check?
3b will be useful in cases where a redirect term leads to another redirect term. Not sure if this is ever the case. If this is never the case, then we just perform 3c.
I think we concluded that MediaWiki itself prevents redirect-to-redirect relations. Also, when we construct the relation table I take precautions against that (I think!?).
So I'll skip 3b for good since there's no need for it at all.
Given that MediaWiki doesn't allow redirect-to-redirect relations, I skip all the clauses that check whether a page is or is not a redirect.
So I rewrote my initial queries (case-sensitive or not) to do a GROUP BY tid, thereby fetching ONLY base pages. Then I do the count check and the disambiguation check (according to the algorithm), and now I think our results are more user-friendly and conclusive.
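For anyone following along, here is a minimal sketch of what such a query might look like. It assumes a `page_relation` table with the four columns named earlier (sid, stitle, tid, ttitle). The SQLite in-memory database, the sample rows, and the helper names `make_demo_db` and `base_pages` are all hypothetical and only serve the illustration; they are not the actual wikisyno schema or code.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with made-up rows (not real wikisyno data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_relation "
        "(sid INTEGER, stitle TEXT, tid INTEGER, ttitle TEXT)"
    )
    conn.executemany(
        "INSERT INTO page_relation VALUES (?, ?, ?, ?)",
        [
            (101, "AJAX", 1, "Ajax (programming)"),
            (102, "Ajax", 1, "Ajax (programming)"),
            (103, "ajax", 2, "Ajax (mythology)"),
        ],
    )
    return conn


def base_pages(conn, term, case_sensitive=False):
    """Return one (tid, ttitle) row per base page that `term` leads to."""
    # SQLite compares text case-sensitively by default (BINARY);
    # NOCASE folds ASCII letters for the case-insensitive variant.
    collation = "BINARY" if case_sensitive else "NOCASE"
    sql = (
        "SELECT tid, MIN(ttitle) FROM page_relation "
        f"WHERE stitle = ? COLLATE {collation} "
        "GROUP BY tid ORDER BY tid"  # one row per target (base) page
    )
    return conn.execute(sql, (term,)).fetchall()


conn = make_demo_db()
print(base_pages(conn, "ajax"))                       # case-insensitive
print(base_pages(conn, "ajax", case_sensitive=True))  # case-sensitive
```

With these sample rows the case-insensitive call returns two base pages, which is the situation where the count check would then kick in; the case-sensitive call returns only one.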
You can check it in wikisyno/?action=ajax_v2term=YOUR TERM
Some case studies for you are: Ajax, ajax, aJaX, c plus plus, c pound, settlement.
I have also added the 2 CS columns and I'm going to test speed now --> pending update!!!
Update: indexing the CS columns in the page_relation table has improved speed significantly!
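A rough sketch of the indexing idea, assuming the "CS columns" are case-sensitive copies of the two title columns. The column names `stitle_cs` and `ttitle_cs`, the index names, and the use of SQLite are assumptions made for the example; the real table may differ. `EXPLAIN QUERY PLAN` is used only to show that the lookup can be served from the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_relation ("
    "sid INTEGER, stitle TEXT, tid INTEGER, ttitle TEXT, "
    "stitle_cs TEXT, ttitle_cs TEXT)"  # *_cs column names are hypothetical
)
# One index per case-sensitive column, so equality lookups avoid a full scan.
conn.execute("CREATE INDEX idx_stitle_cs ON page_relation (stitle_cs)")
conn.execute("CREATE INDEX idx_ttitle_cs ON page_relation (ttitle_cs)")


def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


plan = query_plan(
    conn, "SELECT tid FROM page_relation WHERE stitle_cs = ?", ("Ajax",)
)
print(plan)  # the detail text should mention idx_stitle_cs
```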
Issue case-insensitive query
1a. If 1 entry returned, keep it, proceed to Step 3
1b. If n>1 entries returned
    1b.a If only one entry is a non-redirect, keep it, proceed to Step 3
    1b.b If more than one non-redirects, proceed to Step 2
1c. If no matching entries, return a 204 HTTP code (No Content) and a corresponding message
At this step, we have a single candidate term to use.
3a. If the term is a disambiguation page, return a 300 HTTP code (Multiple Choices) and the appropriate JSON with the entries that appear in the disambiguation page and a warning message: "The entry is a disambiguation page in Wikipedia. Please query again with one of the returned terms."
3b. If the term is a redirect, then replace the term with the redirect target, and repeat Step 3.
3c. If the term is not a redirect, find all the redirect terms that lead to it, and return them as synonyms. Return the term as the canonical form in the JSON.
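The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the wikisyno implementation: pages live in an in-memory list, the `Page` class, the JSON keys, the 200 status for the success case, and the sample pages are all made up. Step 2 is not spelled out in this thread, so the sketch takes it as a caller-supplied function; `exact_match` is one hypothetical choice for it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

WARNING = ("The entry is a disambiguation page in Wikipedia. "
           "Please query again with one of the returned terms.")


@dataclass
class Page:
    title: str
    redirect_to: Optional[str] = None  # target title when the page is a redirect
    disambiguation: List[str] = field(default_factory=list)  # listed entries


def lookup(pages: List[Page], term: str,
           step2: Callable[[str, List[Page]], Optional[Page]]) -> Tuple[int, dict]:
    by_title = {p.title: p for p in pages}
    # Step 1: case-insensitive query.
    matches = [p for p in pages if p.title.lower() == term.lower()]
    if not matches:                                   # 1c
        return 204, {"message": "No matching entries"}
    if len(matches) == 1:                             # 1a
        candidate = matches[0]
    else:                                             # 1b
        non_redirects = [p for p in matches if p.redirect_to is None]
        if len(non_redirects) == 1:                   # 1b.a
            candidate = non_redirects[0]
        else:
            # 1b.b; the zero-non-redirect case is unspecified in the thread
            # and is also handed to Step 2 here (assumption).
            candidate = step2(term, matches)
            if candidate is None:                     # assumption: treat as no content
                return 204, {"message": "No matching entries"}
    # Step 3: a single candidate term.
    seen = set()
    while True:
        if candidate.disambiguation:                  # 3a
            return 300, {"entries": list(candidate.disambiguation),
                         "warning": WARNING}
        if candidate.redirect_to is None:
            break
        if candidate.title in seen:                   # guard against redirect loops
            raise ValueError("redirect loop at " + candidate.title)
        seen.add(candidate.title)
        candidate = by_title[candidate.redirect_to]   # 3b
    # 3c: every redirect pointing at the canonical page is a synonym.
    synonyms = sorted(p.title for p in pages if p.redirect_to == candidate.title)
    return 200, {"canonical": candidate.title, "synonyms": synonyms}


def exact_match(term: str, matches: List[Page]) -> Optional[Page]:
    """Hypothetical Step 2: prefer the entry whose title matches exactly."""
    return next((p for p in matches if p.title == term), None)


PAGES = [
    Page("C++"),
    Page("C plus plus", redirect_to="C++"),
    Page("Cpp", redirect_to="C++"),
    Page("Ajax", disambiguation=["Ajax (programming)", "Ajax (mythology)"]),
    Page("AJAX", redirect_to="Ajax (programming)"),
    Page("Ajax (programming)"),
]

print(lookup(PAGES, "c plus plus", exact_match))
print(lookup(PAGES, "aJaX", exact_match))
print(lookup(PAGES, "settlement", exact_match))
```

The 3b loop is kept even though it may never fire in practice; if redirect-to-redirect relations really cannot occur, it simply runs zero times.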