artshumrc / giza

JSON API (for TMS Database) and Django 2 application for Digital Giza
http://giza.fas.harvard.edu/

"old" production site URL still active #59

Closed npicardo closed 7 years ago

npicardo commented 7 years ago

A recent Google search revealed that the "old" URL (giza-web.rc.fas.harvard.edu) is still active in addition to giza.fas.harvard.edu, so Google generates results for both separately. I tested whether they have the same target. Many searches produce the same results; however, a blank simple search (totals for all data) produces different totals for "search results found" at the top of the results page: 145,994 on the old URL but 145,998 on the new. Faceted search category totals are identical on both URLs when broken down individually, but they add up to only 145,823.
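To make the size of the mismatch concrete, here is a minimal sketch of the arithmetic using only the figures quoted above. The names (`reported_totals`, `facet_sum`, `facet_gap`) are made up for illustration and are not part of the giza codebase; the sketch shows the gaps but does not explain them.

```python
# Headline "search results found" totals for a blank simple search,
# as reported in this issue for each host.
reported_totals = {
    "giza-web.rc.fas.harvard.edu": 145_994,
    "giza.fas.harvard.edu": 145_998,
}

# Sum of the individual facet category totals (identical on both hosts).
facet_sum = 145_823


def facet_gap(total: int, facet_total: int) -> int:
    """Difference between the headline total and the summed facet counts."""
    return total - facet_total


# Gap between headline total and facet sum, per host.
gaps = {host: facet_gap(total, facet_sum) for host, total in reported_totals.items()}

# Gap between the two hosts' headline totals.
site_gap = (
    reported_totals["giza.fas.harvard.edu"]
    - reported_totals["giza-web.rc.fas.harvard.edu"]
)

for host, gap in gaps.items():
    print(f"{host}: headline total exceeds facet sum by {gap}")
print(f"difference between the two headline totals: {site_gap}")
```

So the two hosts differ from each other by only 4 records, while each headline total exceeds the facet sum by 171 and 175 respectively.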

rsinghal commented 7 years ago

giza-web is not the old production site - it's the dev site for testing purposes before pushing to production. I can make that site not indexable by Google, but they may have different numbers depending on when the last data refresh happened on each of the servers.

npicardo commented 7 years ago

Hi Rashmi,

Ugh, sorry for my brain lapse about that URL. If there’s value for you on the dev side to have it indexed by Google, I defer to your judgement on whether or not to make that change. If there’s no major upside, probably better to cancel the indexing to prevent user-side confusion and to know that our analytics for the public prototype site represent all traffic. Thanks for the explanation! Cheers, Nick

npicardo commented 7 years ago

Is it a concern that the displayed total items found for a search doesn’t match the combined faceted values for either of the site instances?

rsinghal commented 7 years ago

Agreed - no need for the dev site to be indexed. I will get that in place sometime this week. As for the search numbers and facets having different numbers, that could be a concern. I will make it a separate ticket for later investigation.

npicardo commented 7 years ago

Awesome, thanks!

rsinghal commented 7 years ago

robots.txt added to giza-web
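The contents of the file are not shown in this thread. As a hedged sketch, assuming it is a blanket disallow for all crawlers, the rule can be checked with Python's standard-library `urllib.robotparser`. The `ROBOTS_TXT` string and the `is_crawlable` helper are illustrative assumptions, not the file actually deployed on giza-web.

```python
from urllib.robotparser import RobotFileParser

# ASSUMED contents: block every crawler from every path.
# The real robots.txt on giza-web may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


dev_url = "http://giza-web.rc.fas.harvard.edu/"

# With the blanket disallow, a compliant crawler may not fetch the dev site.
blocked = is_crawlable(ROBOTS_TXT, "Googlebot", dev_url)

# With an empty robots.txt, nothing is disallowed.
unrestricted = is_crawlable("", "Googlebot", dev_url)

print(f"blanket disallow allows fetch: {blocked}")
print(f"empty robots.txt allows fetch: {unrestricted}")
```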