Closed npicardo closed 7 years ago
giza-web is not the old production site - it's the dev site for testing purposes before pushing to production. I can make that site not indexable by Google. The two sites may show different numbers, though, depending on when the last data refresh happened on each of the servers.
Hi Rashmi,
Ugh, sorry for my brain lapse about that URL. If there’s value for you on the dev side to have it indexed by Google, I defer to your judgement on whether or not to make that change. If there’s no major upside, probably better to cancel the indexing to prevent user-side confusion and to know that our analytics for the public prototype site represent all traffic. Thanks for the explanation! Cheers, Nick
From: Rashmi Singhal [mailto:notifications@github.com] Sent: Friday, July 07, 2017 4:27 PM To: rsinghal/giza Cc: Picardo, Nicholas; Author Subject: Re: [rsinghal/giza] "old" production site URL still active (#59)
Is it a concern that the displayed total items found for a search doesn’t match the combined faceted values for either of the site instances?
Agreed - no need for the dev site to be indexed. I will get that in place sometime this week. As for the search totals and the facet counts not matching, that could be a concern. I will make it a separate ticket for later investigation.
Awesome, thanks!
From: Rashmi Singhal [mailto:notifications@github.com] Sent: Monday, July 10, 2017 4:04 PM To: rsinghal/giza Cc: Picardo, Nicholas; Author Subject: Re: [rsinghal/giza] "old" production site URL still active (#59)
robots.txt added to giza-web
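For reference, here is a minimal sketch of how a blanket-disallow robots.txt could be checked with Python's standard `urllib.robotparser`. The file contents below are an assumption, since the thread does not show what was actually deployed on giza-web; the `http://` scheme and the `is_crawlable` helper are also illustrative only.

```python
from urllib.robotparser import RobotFileParser

# ASSUMED contents: the thread does not show the real robots.txt on giza-web.
# This is the conventional "block every crawler from every path" form.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hostname is from the issue; the http:// scheme is an assumption.
DEV_URL = "http://giza-web.rc.fas.harvard.edu/"


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# With the assumed file, no compliant crawler may fetch the dev site.
dev_crawlable = is_crawlable(ASSUMED_ROBOTS_TXT, DEV_URL)

# An empty robots.txt places no restrictions, for comparison.
unrestricted = is_crawlable("", DEV_URL)

print(dev_crawlable, unrestricted)
```

Note that a robots.txt rule only stops compliant crawlers from fetching pages; URLs that were already indexed may take some time to drop out of search results.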
A recent Google search revealed that the "old" URL (giza-web.rc.fas.harvard.edu) is still active in addition to giza.fas.harvard.edu, so Google returns results for both separately. I tested to see if they have the same target. Many searches produce the same results; however, a blank simple search (totals for all data) produces different totals for "search results found" at the top of the results page: 145,994 on the old URL but 145,998 on the new one. Faceted search category totals are identical on both URLs when broken down individually, but when added up they total 145,823.
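The size of each mismatch can be worked out directly from the three figures reported above. This is only an arithmetic sketch; the variable names are illustrative and nothing here queries either site.

```python
# Figures reported in this issue for a blank simple search.
OLD_URL_TOTAL = 145_994   # "search results found" on giza-web.rc.fas.harvard.edu
NEW_URL_TOTAL = 145_998   # "search results found" on giza.fas.harvard.edu
FACET_SUM = 145_823       # sum of the facet category totals (same on both URLs)

# Difference between the two sites' displayed totals.
between_sites = NEW_URL_TOTAL - OLD_URL_TOTAL

# How far the summed facet counts fall short of each displayed total.
facet_gap_old = OLD_URL_TOTAL - FACET_SUM
facet_gap_new = NEW_URL_TOTAL - FACET_SUM

print(f"Difference between sites:       {between_sites}")
print(f"Facet shortfall vs. old total:  {facet_gap_old}")
print(f"Facet shortfall vs. new total:  {facet_gap_new}")
```

So the two sites differ by 4 items, while the facet sum falls short of the displayed totals by 171 and 175 items respectively.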