Currently the main site has a bunch of data scraped from Reddit, and it's being cached by Google, which isn't good. The scraped data was only ever meant for testing purposes - it definitely shouldn't be on the main site when/if going public.
But having all the test data is really helpful for testing the site thoroughly, both during development and probably for potential users. Since the site won't receive much activity in the beginning and is heavily data-oriented (users/followers/tags/votes all interconnected), there wouldn't be anything to test without it.
So the plan is to keep two versions of the site: a main one with a clean slate, and a "test" one with the Reddit-scraped data. The test one should not show up on Google and should probably have clearly visible disclaimers that it uses data scraped from Reddit.
The main site should have a link to the test site and vice versa.
[x] Create separate subdomain for test site (test.bublol.com)
[x] Make it non-crawlable by search engines (/robots.txt)
[ ] Mention test site on main site
[ ] Mention disclaimer on test site about reddit scraped data
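
For the /robots.txt item, here is a rough sketch of how the two sites could serve different rules if both end up running from the same codebase. That shared codebase is an assumption, and `robots_txt_for_host` and `TEST_HOST` are made-up names for illustration, not anything that exists in the project. The idea is just to pick the robots.txt body based on the request's Host header:

```python
# Hypothetical helper: choose the robots.txt body from the Host header.
# Assumes the main and test sites are served by the same application.

TEST_HOST = "test.bublol.com"  # subdomain from the checklist above


def robots_txt_for_host(host: str) -> str:
    """Return the robots.txt body for the given Host header value."""
    # Drop an optional ":port" suffix and normalize case before comparing.
    hostname = host.split(":")[0].strip().lower()
    if hostname == TEST_HOST:
        # Test site: ask all crawlers to stay away from every path.
        return "User-agent: *\nDisallow: /\n"
    # Main site: an empty Disallow value permits crawling everything.
    return "User-agent: *\nDisallow:\n"


if __name__ == "__main__":
    print(robots_txt_for_host("test.bublol.com"))
```

Whatever web framework the site uses would call this from its `/robots.txt` route and return the result as `text/plain`. If the test site is deployed separately, a static robots.txt file with the same two lines would do the same job.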