Closed pypt closed 6 years ago
Thanks for starting this!
I'm just going to comment and let you edit the master list. Below is the list of everything you have marked for removal:
SQL schema stuff:
And here's more stuff I think we can get rid of:
Thanks, so here's the list of stuff to remove for pass 1 of the cleanup (split up into separate "consumable" issues to avoid scope creep):
-server: done in a0beda079
doc/repo-map.markdown: done in f1ab01c0f
doc/story_index_migration.markdown: done in 8d932c1a5
doc/tutorial.markdown: done in 79e969c24
CONTRIBUTING.md: done in 8e8a6dbd5
scripts/: #412
collection2: #413
tools/benchmark/: done in 3f54aedd6
tools/graph/layout_with_fa2.py: done in 6da669360
data/cache/: none of the subdirectories were under source control, but cleaned them up on mccore1
db_row_last_updated: #414
download.types: #415
Closing this for now.
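For anyone double-checking a pass like this, here is a minimal sketch (not something from the thread) of how one might confirm which candidate paths still have tracked files. It assumes you feed it the output of `git ls-files` as a list of paths; the function name `still_tracked` and the sample paths are hypothetical.

```python
def still_tracked(candidates, tracked_files):
    """Map each candidate path to the tracked files at or below it.

    `tracked_files` is assumed to be the line-split output of `git ls-files`.
    Candidates with no tracked files (already removed, or never under
    source control) are left out of the result.
    """
    hits = {}
    for cand in candidates:
        prefix = cand.rstrip("/")
        matches = [
            f for f in tracked_files
            if f == prefix or f.startswith(prefix + "/")
        ]
        if matches:
            hits[cand] = matches
    return hits


# Hypothetical `git ls-files` output, for illustration only.
tracked = ["doc/solr.markdown", "tools/graph/other.py", "scripts/run.sh"]
print(still_tracked(["data/cache/", "scripts/", "doc/tutorial.markdown"], tracked))
# prints {'scripts/': ['scripts/run.sh']}
```

An empty result for a directory such as `data/cache/` would match the "not under source control" case noted above.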
Our spring codebase clean-up didn't quite happen due to story indexing / partitioning / other stuff, so I've tried to compile a list of stuff that we could potentially remove.
Purpose:
Guidelines:
HEAD
to Git's history.
Every item that I think could be considered for removal is marked with an [x] (and thus GitHub made it into a checkbox). To make it easier to search the list in a browser, I've additionally marked every proposed item with a (farewell!) string.
Would love some comments @hroberts!
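If searching in the browser gets tedious, the marker also makes the list easy to filter after pasting it into a text file. A rough sketch, assuming one entry per line with the (farewell!) marker somewhere on that line; `marked_items` and the sample text are made up for illustration:

```python
MARKER = "(farewell!)"


def marked_items(listing):
    """Return the entries of a pasted listing that carry the removal marker."""
    items = []
    for line in listing.splitlines():
        if MARKER in line:
            # Drop the marker itself, keep the path (and any trailing comment).
            items.append(line.replace(MARKER, "").strip())
    return items


sample = """\
CONTRIBUTING.md (farewell!)
INSTALL.markdown
doc/repo-map.markdown (farewell!)
"""
print(marked_items(sample))
# prints ['CONTRIBUTING.md', 'doc/repo-map.markdown']
```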
Features to remove
Individual files to remove
.python-version
.travis-after_success.sh
.travis-before_install.sh
.travis-install.sh
.travis-lxd
.travis-lxd/config.inc.sh
.travis-lxd/setup_lxd.inc.sh
.travis-lxd/setup_lxd_container.inc.sh
.travis-lxd/setup_travis_lxd_image.sh
.travis-script.sh
.travis.yml
CONTRIBUTING.md (farewell!)
INSTALL.markdown
LICENSE
README.markdown
_Inline
_Inline-webapp
ansible
ansible/ansible.cfg
ansible/deploy.retry
ansible/deploy.yml
ansible/group_vars
ansible/group_vars/all.yml
ansible/group_vars/ungrouped.yml
ansible/host_vars
ansible/host_vars/localhost.yml
ansible/inventory
ansible/inventory/hosts.sample.yml
ansible/inventory/hosts.yml
ansible/mediacloud.retry
ansible/pre-tasks.yml
ansible/roles
ansible/roles/apache2-fcgi
ansible/roles/apache2-fcgi/handlers
ansible/roles/apache2-fcgi/handlers/main.yml
ansible/roles/apache2-fcgi/tasks
ansible/roles/apache2-fcgi/tasks/main-MacOSX.yml
ansible/roles/apache2-fcgi/tasks/main-Ubuntu.yml
ansible/roles/apache2-fcgi/tasks/main.yml
ansible/roles/apache2-fcgi/templates
ansible/roles/apache2-fcgi/templates/001-mediacloud.conf.j2
ansible/roles/apache2-fcgi/vars
ansible/roles/apache2-fcgi/vars/main-MacOSX.yml
ansible/roles/apache2-fcgi/vars/main-Ubuntu.yml
ansible/roles/apache2-fcgi/vars/main.yml
ansible/roles/common
ansible/roles/common/tasks
ansible/roles/common/tasks/main-MacOSX.yml
ansible/roles/common/tasks/main-Ubuntu.yml
ansible/roles/common/tasks/main.yml
ansible/roles/crontab
ansible/roles/crontab/tasks
ansible/roles/crontab/tasks/main.yml
(farewell!) import_solr_data Cron job. The mediawords_import_solr_data.pl script has been deprecated and disabled. Who's doing the Solr importing then? I can't find anything in Crontab or Supervisor.
ansible/roles/crontab/vars
ansible/roles/crontab/vars/main.yml
ansible/roles/deploy
ansible/roles/deploy/tasks
ansible/roles/deploy/tasks/main.yml
ansible/roles/git-hooks
ansible/roles/git-hooks/tasks
ansible/roles/git-hooks/tasks/main.yml
ansible/roles/git-hooks/vars
ansible/roles/git-hooks/vars/main.yml
ansible/roles/git-repository
ansible/roles/git-repository/tasks
ansible/roles/git-repository/tasks/main.yml
ansible/roles/hostname
ansible/roles/hostname/handlers
ansible/roles/hostname/handlers/main.yml
ansible/roles/hostname/tasks
ansible/roles/hostname/tasks/main-MacOSX.yml
ansible/roles/hostname/tasks/main-Ubuntu.yml
ansible/roles/hostname/tasks/main.yml
ansible/roles/locale
ansible/roles/locale/tasks
ansible/roles/locale/tasks/main-MacOSX.yml
ansible/roles/locale/tasks/main-Ubuntu.yml
ansible/roles/locale/tasks/main.yml
ansible/roles/locale/vars
ansible/roles/locale/vars/main-MacOSX.yml
ansible/roles/locale/vars/main-Ubuntu.yml
ansible/roles/locale/vars/main.yml
ansible/roles/mediawords-yml
ansible/roles/mediawords-yml/tasks
ansible/roles/mediawords-yml/tasks/main.yml
ansible/roles/pam-limits
ansible/roles/pam-limits/tasks
ansible/roles/pam-limits/tasks/main-MacOSX.yml
ansible/roles/pam-limits/tasks/main-Ubuntu.yml
ansible/roles/pam-limits/tasks/main.yml
ansible/roles/pam-limits/vars
ansible/roles/pam-limits/vars/main-MacOSX.yml
ansible/roles/pam-limits/vars/main-Ubuntu.yml
ansible/roles/pam-limits/vars/main.yml
ansible/roles/perl-dependencies
ansible/roles/perl-dependencies/tasks
ansible/roles/perl-dependencies/tasks/main.yml
ansible/roles/perl-dependencies/vars
ansible/roles/perl-dependencies/vars/main-MacOSX.yml
ansible/roles/perl-dependencies/vars/main-Ubuntu.yml
ansible/roles/perl-dependencies/vars/main.yml
ansible/roles/perlbrew
ansible/roles/perlbrew/tasks
ansible/roles/perlbrew/tasks/main.yml
ansible/roles/perlbrew/vars
ansible/roles/perlbrew/vars/main.yml
ansible/roles/postgresql-server
ansible/roles/postgresql-server/handlers
ansible/roles/postgresql-server/handlers/main.yml
ansible/roles/postgresql-server/tasks
ansible/roles/postgresql-server/tasks/install-MacOSX.yml
ansible/roles/postgresql-server/tasks/install-Ubuntu.yml
ansible/roles/postgresql-server/tasks/main.yml
ansible/roles/postgresql-server/tasks/start-MacOSX.yml
ansible/roles/postgresql-server/tasks/start-Ubuntu.yml
ansible/roles/postgresql-server/templates
ansible/roles/postgresql-server/templates/01-mediacloud.conf.j2
ansible/roles/postgresql-server/vars
ansible/roles/postgresql-server/vars/main-MacOSX.yml
ansible/roles/postgresql-server/vars/main-Ubuntu.yml
ansible/roles/postgresql-server/vars/main.yml
ansible/roles/python-dependencies
ansible/roles/python-dependencies/files
ansible/roles/python-dependencies/files/requirements.txt
ansible/roles/python-dependencies/tasks
ansible/roles/python-dependencies/tasks/main.yml
ansible/roles/system-packages
ansible/roles/system-packages/tasks
ansible/roles/system-packages/tasks/main-MacOSX.yml
ansible/roles/system-packages/tasks/main-Ubuntu.yml
ansible/roles/system-packages/tasks/main.yml
ansible/roles/system-packages/vars
ansible/roles/system-packages/vars/main-MacOSX.yml
ansible/roles/system-packages/vars/main-Ubuntu.yml
ansible/roles/system-packages/vars/main.yml
ansible/roles/timezone
ansible/roles/timezone/handlers
ansible/roles/timezone/handlers/main.yml
ansible/roles/timezone/tasks
ansible/roles/timezone/tasks/main-MacOSX.yml
ansible/roles/timezone/tasks/main-Ubuntu.yml
ansible/roles/timezone/tasks/main.yml
ansible/roles/timezone/vars
ansible/roles/timezone/vars/main-MacOSX.yml
ansible/roles/timezone/vars/main-Ubuntu.yml
ansible/roles/timezone/vars/main.yml
ansible/roles/update-packages
ansible/roles/update-packages/tasks
ansible/roles/update-packages/tasks/main-MacOSX.yml
ansible/roles/update-packages/tasks/main-Ubuntu.yml
ansible/roles/update-packages/tasks/main.yml
ansible/roles/user
ansible/roles/user/tasks
ansible/roles/user/tasks/main.yml
ansible/setup.retry
ansible/setup.yml
ansible/travis.yml
app.psgi
data
data/cache
data/cache/sentence_field_counts
data/cache/solr_import_file_errors
data/cache/solr_import_file_pos
data/cache/word_counts
data/logs
data/rabbitmq
data/rabbitmq/advanced.config
data/rabbitmq/enabled_plugins
data/rabbitmq/generated_config
data/rabbitmq/generated_config/generated
data/rabbitmq/generated_config/generated/rabbitmq.config
data/rabbitmq/logs
data/rabbitmq/logs/log
data/rabbitmq/mnesia
data/rabbitmq/mnesia/mediacloud@localhost
data/rabbitmq/rabbitmq.conf
data/rabbitmq/schema
data/rabbitmq/schema/rabbit.schema
data/rabbitmq/schema/rabbitmq_amqp1_0.schema
data/rabbitmq/schema/rabbitmq_management.schema
data/solr
data/solr/dist
data/solr_dumps
data/solr_dumps/dumps
data/supervisor_logs
doc
doc/ansible.markdown
doc/api_2_0_spec
doc/api_2_0_spec/admin_api_2_0_spec.md
doc/api_2_0_spec/api_2_0_spec.md
doc/api_2_0_spec/github-markdown-toc
doc/api_2_0_spec/topics_api_2_0_spec.md
doc/api_2_0_spec/update_toc.sh
doc/auth.markdown
doc/backup_crawler.markdown
(farewell!) -systems?
doc/coding_guidelines
doc/coding_guidelines.markdown
doc/coding_guidelines/pycharm-no-warnings.png
doc/coding_guidelines/pycharm-suppress-warning.png
doc/coding_guidelines/pycharm-warning.png
doc/crawler.markdown
doc/db_migrations.markdown
doc/diagrams (farewell!)
doc/diagrams/story_processing_flow.pdf (farewell!)
doc/diagrams/story_processing_flow.xml (farewell!)
doc/extractor.markdown
doc/facebook_api.markdown
doc/job_manager.markdown
doc/logging.markdown
doc/perl_to_python_rewrite.mdown
doc/perlbrew.markdown
doc/repo-map.markdown (farewell!)
doc/solr.markdown
doc/story_index_migration.markdown (farewell!)
doc/story_processing_flows.markdown
doc/supervisor.markdown
doc/test_suite.markdown
doc/topic_mining.markdown
doc/topic_mining.mermaid
doc/topic_mining.png
doc/topic_snapshots.markdown
doc/topics.markdown
doc/tutorial.markdown (farewell!)
doc/vagrant.markdown
doc/validate
(farewell!) doc/validate/feedly_import/README.markdown are just dead). I'd say let's move them to our blog or whatever.
doc/validate/README.markdown (farewell!)
doc/validate/date_guess_threshold (farewell!)
doc/validate/date_guess_threshold/README.markdown (farewell!)
doc/validate/date_guess_threshold/date_guess_threshold_validation.csv (farewell!)
doc/validate/feedly_import (farewell!)
doc/validate/feedly_import/README.markdown (farewell!)
doc/validate/feedly_import/feedly_feeds.ods (farewell!)
doc/validate/feedly_import/feedly_stories.ods (farewell!)
doc/validate/topic_re (farewell!)
doc/validate/topic_re/topic_regexes.txt (farewell!)
install.sh
lib
lib/Catalyst
lib/Catalyst/Action
lib/Catalyst/Action/MC_REST.pm
lib/Catalyst/Authentication
lib/Catalyst/Authentication/Credential
lib/Catalyst/Authentication/Credential/MediaWords
lib/Catalyst/Authentication/Credential/MediaWords/APIKey.pm
lib/Catalyst/Authentication/Credential/MediaWords/UsernamePassword.pm
lib/Catalyst/Authentication/Store
lib/Catalyst/Authentication/Store/MediaWords.pm
lib/Catalyst/Plugin
lib/Catalyst/Plugin/ConfigDefaults.pm
lib/Devel
lib/Devel/Cover
lib/Devel/Cover/Report
lib/Devel/Cover/Report/CoverallsJSON.pm
lib/HTML
lib/HTML/FormFu
lib/HTML/FormFu/Constraint
lib/HTML/FormFu/Constraint/FeedURL.pm
lib/HTML/FormFu/Constraint/TagPrefix.pm
lib/HTML/FormFu/Constraint/URL.pm
lib/HTML/FormFu/OutputProcessor
lib/HTML/FormFu/OutputProcessor/RemoveEndForm.pm
lib/HTML/FormFu/Unicode.pm
lib/HTML/FormFu/Validator
lib/HTML/FormFu/Validator/FeedIsUnique.pm
lib/HTML/FormFu/Validator/MediumNameIsUnique.pm
lib/HTML/FormFu/Validator/MediumUrlIsUnique.pm
lib/MediaWords
lib/MediaWords.pm
lib/MediaWords/AbstractJob.pm
lib/MediaWords/ActionRole
lib/MediaWords/ActionRole/AbstractAuthenticatedActionRole.pm
lib/MediaWords/ActionRole/AdminAuthenticated.pm
lib/MediaWords/ActionRole/AdminReadAuthenticated.pm
lib/MediaWords/ActionRole/Logged.pm
lib/MediaWords/ActionRole/MediaEditAuthenticated.pm
lib/MediaWords/ActionRole/PublicApiKeyAuthenticated.pm
lib/MediaWords/ActionRole/RoleAuthenticated.pm
lib/MediaWords/ActionRole/StoriesEditAuthenticated.pm
lib/MediaWords/ActionRole/Throttled.pm
lib/MediaWords/ActionRole/TopicsAdminAuthenticated.pm
lib/MediaWords/ActionRole/TopicsReadAuthenticated.pm
lib/MediaWords/ActionRole/TopicsWriteAuthenticated.pm
lib/MediaWords/CommonLibs.pm
lib/MediaWords/Controller
lib/MediaWords/Controller/Admin
lib/MediaWords/Controller/Admin/CM.pm (farewell!)
lib/MediaWords/Controller/Admin/Downloads.pm
lib/MediaWords/Controller/Admin/Feeds.pm
lib/MediaWords/Controller/Admin/Health.pm
lib/MediaWords/Controller/Admin/Media
lib/MediaWords/Controller/Admin/Media.pm
lib/MediaWords/Controller/Admin/Media/Moderate.pm (farewell!)
lib/MediaWords/Controller/Admin/Profile.pm
lib/MediaWords/Controller/Admin/Stop_Server.pm
lib/MediaWords/Controller/Admin/Stories.pm
lib/MediaWords/Controller/Admin/TM.pm
lib/MediaWords/Controller/Admin/TagSets.pm
lib/MediaWords/Controller/Admin/Tags.pm
lib/MediaWords/Controller/Admin/Users.pm
lib/MediaWords/Controller/Api
lib/MediaWords/Controller/Api/V2
lib/MediaWords/Controller/Api/V2/Auth.pm
lib/MediaWords/Controller/Api/V2/Controversies.pm (farewell!)
lib/MediaWords/Controller/Api/V2/Controversy_Dump_Time_Slices.pm
lib/MediaWords/Controller/Api/V2/Controversy_Dumps.pm
lib/MediaWords/Controller/Api/V2/Downloads.pm
lib/MediaWords/Controller/Api/V2/Feeds.pm
lib/MediaWords/Controller/Api/V2/MC_Controller_REST.pm
lib/MediaWords/Controller/Api/V2/MC_REST_SimpleObject.pm
lib/MediaWords/Controller/Api/V2/Media.pm
submit_suggestion() and related code.
media_suggestions table, I see only three valid suggestions. Given that suggestions require manual moderation, and given the very low volume, maybe new media could simply be suggested to us by email?
lib/MediaWords/Controller/Api/V2/MediaHealth.pm
lib/MediaWords/Controller/Api/V2/Sentences.pm
lib/MediaWords/Controller/Api/V2/Stats.pm
lib/MediaWords/Controller/Api/V2/Stories.pm
word_matrix() and related code.
lib/MediaWords/Controller/Api/V2/StoriesBase.pm
lib/MediaWords/Controller/Api/V2/Stories_Public.pm
lib/MediaWords/Controller/Api/V2/Tag_Sets.pm
lib/MediaWords/Controller/Api/V2/Tags.pm
lib/MediaWords/Controller/Api/V2/Topics
lib/MediaWords/Controller/Api/V2/Topics.pm
lib/MediaWords/Controller/Api/V2/Topics/Focal_Set_Definitions.pm (farewell!)
lib/MediaWords/Controller/Api/V2/Topics/Focal_Sets.pm (farewell!)
lib/MediaWords/Controller/Api/V2/Topics/Foci.pm (farewell!)
lib/MediaWords/Controller/Api/V2/Topics/Focus_Definitions.pm (farewell!)
foci, focal_sets, ...), mostly resembling what's already in the topics table, so I'm not sure if it serves any use. That code was last touched a year ago.
lib/MediaWords/Controller/Api/V2/Topics/Media.pm
lib/MediaWords/Controller/Api/V2/Topics/Permissions.pm
lib/MediaWords/Controller/Api/V2/Topics/Sentences.pm
lib/MediaWords/Controller/Api/V2/Topics/Snapshots.pm
lib/MediaWords/Controller/Api/V2/Topics/Stories.pm
lib/MediaWords/Controller/Api/V2/Topics/Timespans.pm
lib/MediaWords/Controller/Api/V2/Topics/Wc.pm
lib/MediaWords/Controller/Api/V2/Util.pm
is_syndicated_ap() and related code.
lib/MediaWords/Controller/Api/V2/Wc.pm
lib/MediaWords/Controller/Login.pm
lib/MediaWords/Controller/Logout.pm
lib/MediaWords/Controller/Root.pm
lib/MediaWords/Controller/Search.pm
lib/MediaWords/Controller/Status.pm
lib/MediaWords/Crawler
lib/MediaWords/Crawler/Download
lib/MediaWords/Crawler/Download/Content.pm
lib/MediaWords/Crawler/Download/DefaultFetcher.pm
lib/MediaWords/Crawler/Download/DefaultHandler.pm
lib/MediaWords/Crawler/Download/Feed
lib/MediaWords/Crawler/Download/Feed/FeedHandler.pm
lib/MediaWords/Crawler/Download/Feed/Superglue.pm
(farewell!) 363
lib/MediaWords/Crawler/Download/Feed/Syndicated.pm
lib/MediaWords/Crawler/Download/Feed/Univision.pm
lib/MediaWords/Crawler/Download/Feed/WebPage.pm
lib/MediaWords/Crawler/Downloads_Queue.pm
lib/MediaWords/Crawler/Engine.pm
lib/MediaWords/Crawler/FetcherRole.pm
lib/MediaWords/Crawler/HandlerRole.pm
lib/MediaWords/Crawler/Provider.pm
lib/MediaWords/DB
lib/MediaWords/DB.pm
lib/MediaWords/DB/HandlerProxy.pm
lib/MediaWords/DB/Locks.pm
lib/MediaWords/DB/Schema
lib/MediaWords/DB/Schema.pm
lib/MediaWords/DB/Schema/Version.pm
lib/MediaWords/DBI
lib/MediaWords/DBI/Activities.pm
(farewell!) activities table.
lib/MediaWords/DBI/ApiLinks.pm
lib/MediaWords/DBI/Auth
lib/MediaWords/DBI/Auth.pm
lib/MediaWords/DBI/Auth/ChangePassword.pm
lib/MediaWords/DBI/Auth/Limits.pm
lib/MediaWords/DBI/Auth/Login.pm
lib/MediaWords/DBI/Auth/Password.pm
lib/MediaWords/DBI/Auth/Profile.pm
lib/MediaWords/DBI/Auth/Register.pm
lib/MediaWords/DBI/Auth/ResetPassword.pm
lib/MediaWords/DBI/Auth/Roles
lib/MediaWords/DBI/Auth/Roles.pm
lib/MediaWords/DBI/Auth/Roles/List.pm
lib/MediaWords/DBI/Auth/User
lib/MediaWords/DBI/Auth/User/AbstractUser.pm
lib/MediaWords/DBI/Auth/User/CurrentUser
lib/MediaWords/DBI/Auth/User/CurrentUser.pm
lib/MediaWords/DBI/Auth/User/CurrentUser/APIKey.pm
lib/MediaWords/DBI/Auth/User/CurrentUser/Role.pm
lib/MediaWords/DBI/Auth/User/ModifyUser.pm
lib/MediaWords/DBI/Auth/User/NewOrModifyUser.pm
lib/MediaWords/DBI/Auth/User/NewUser.pm
subscribe_to_newsletter field
auth_users_subscribe_to_newsletter table is empty, so either Rahul does it on their side and just sends us a False, or no one actually ever subscribed to it. Either way, this could go.
lib/MediaWords/DBI/DownloadTexts.pm
lib/MediaWords/DBI/Downloads.pm
lib/MediaWords/DBI/Feeds.pm
lib/MediaWords/DBI/Media
lib/MediaWords/DBI/Media.pm
lib/MediaWords/DBI/Media/Health.pm
lib/MediaWords/DBI/Media/Lookup.pm
lib/MediaWords/DBI/Media/PrimaryLanguage.pm
lib/MediaWords/DBI/Media/Rescrape.pm (farewell!)
lib/MediaWords/DBI/Media/SubjectCountry.pm
lib/MediaWords/DBI/Stats.pm
lib/MediaWords/DBI/Stories
lib/MediaWords/DBI/Stories.pm
lib/MediaWords/DBI/Stories/AP.pm
lib/MediaWords/DBI/Stories/ExtractorArguments.pm
lib/MediaWords/DBI/Stories/ExtractorVersion.pm
lib/MediaWords/DBI/Stories/GuessDate.pm
lib/MediaWords/Feed
lib/MediaWords/Feed/Parse
lib/MediaWords/Feed/Parse.pm
lib/MediaWords/Feed/Parse/SyndicatedFeed.pm
lib/MediaWords/Feed/Scrape
lib/MediaWords/Feed/Scrape.pm
lib/MediaWords/ImportStories
lib/MediaWords/ImportStories.pm
lib/MediaWords/ImportStories/ArchiveOrgTVCaptions.pm (farewell!)
lib/MediaWords/ImportStories/Feedly.pm (farewell!)
lib/MediaWords/ImportStories/ScrapeHTML.pm (farewell!)
lib/MediaWords/Job
lib/MediaWords/Job/CLIFF
lib/MediaWords/Job/CLIFF/FetchAnnotation.pm
lib/MediaWords/Job/CLIFF/UpdateStoryTags.pm
lib/MediaWords/Job/ExtractAndVector.pm
lib/MediaWords/Job/Facebook
lib/MediaWords/Job/Facebook/FetchStoryStats.pm
lib/MediaWords/Job/GenerateRetweeterScores.pm (farewell!)
lib/MediaWords/Job/ImportFeedlyStories.pm (farewell!)
lib/MediaWords/Job/NYTLabels
lib/MediaWords/Job/NYTLabels/FetchAnnotation.pm
lib/MediaWords/Job/NYTLabels/UpdateStoryTags.pm
lib/MediaWords/Job/RescrapeMedia.pm
lib/MediaWords/Job/TM
lib/MediaWords/Job/TM/ExtractStoryLinks.pm
lib/MediaWords/Job/TM/FetchLink.pm
lib/MediaWords/Job/TM/MineTopic.pm
lib/MediaWords/Job/TM/MineTopicPublic.pm
lib/MediaWords/Job/TM/SnapshotTopic.pm
lib/MediaWords/Job/Word2vec
lib/MediaWords/Job/Word2vec/GenerateSnapshotModel.pm
lib/MediaWords/KeyValueStore
lib/MediaWords/KeyValueStore.pm
lib/MediaWords/KeyValueStore/AmazonS3.pm
lib/MediaWords/KeyValueStore/CachedAmazonS3.pm
lib/MediaWords/KeyValueStore/DatabaseInline.pm
lib/MediaWords/KeyValueStore/MultipleStores.pm
lib/MediaWords/KeyValueStore/PostgreSQL.pm
lib/MediaWords/KeyValueStore/t/helpers
lib/MediaWords/KeyValueStore/t/helpers/amazon_s3_set_credentials_from_env.inc.pl
lib/MediaWords/KeyValueStore/t/helpers/amazon_s3_tests.inc.pl
lib/MediaWords/KeyValueStore/t/helpers/create_mock_download.inc.pl
lib/MediaWords/KeyValueStore/t/helpers/postgresql_tests.inc.pl
lib/MediaWords/Languages
lib/MediaWords/Languages/Language
lib/MediaWords/Languages/Language.pm
lib/MediaWords/Languages/Language/PythonWrapper.pm
lib/MediaWords/Languages/ca.pm
lib/MediaWords/Languages/da.pm
lib/MediaWords/Languages/de.pm
lib/MediaWords/Languages/en.pm
lib/MediaWords/Languages/es.pm
lib/MediaWords/Languages/fi.pm
lib/MediaWords/Languages/fr.pm
lib/MediaWords/Languages/ha.pm
lib/MediaWords/Languages/hi.pm
lib/MediaWords/Languages/hu.pm
lib/MediaWords/Languages/it.pm
lib/MediaWords/Languages/ja.pm
lib/MediaWords/Languages/lt.pm
lib/MediaWords/Languages/nl.pm
lib/MediaWords/Languages/no.pm
lib/MediaWords/Languages/pt.pm
lib/MediaWords/Languages/ro.pm
lib/MediaWords/Languages/ru.pm
lib/MediaWords/Languages/sv.pm
lib/MediaWords/Languages/tr.pm
lib/MediaWords/Languages/zh.pm
lib/MediaWords/Model
lib/MediaWords/Model/DBIS.pm
lib/MediaWords/MyFCgiManager.pm
lib/MediaWords/Solr
lib/MediaWords/Solr.pm
lib/MediaWords/Solr/Dump.pm
lib/MediaWords/Solr/PseudoQueries.pm
lib/MediaWords/Solr/Query.pm
lib/MediaWords/Solr/TagCounts.pm
lib/MediaWords/Solr/WordCounts.pm
lib/MediaWords/StoryVectors.pm
lib/MediaWords/TM
lib/MediaWords/TM.pm
lib/MediaWords/TM/FetchLink.pm
lib/MediaWords/TM/FetchTopicTweets.pm
lib/MediaWords/TM/GuessDate
lib/MediaWords/TM/GuessDate.pm
lib/MediaWords/TM/GuessDate/Result.pm
lib/MediaWords/TM/Mine.pm
lib/MediaWords/TM/Model.pm
lib/MediaWords/TM/RetweeterScores.pm
lib/MediaWords/TM/Snapshot
lib/MediaWords/TM/Snapshot.pm
lib/MediaWords/TM/Snapshot/GraphLayout.pm
lib/MediaWords/TM/Stories.pm
lib/MediaWords/Test
lib/MediaWords/Test/API.pm
lib/MediaWords/Test/DB
lib/MediaWords/Test/DB.pm
lib/MediaWords/Test/DB/HandlerProxy.pm
lib/MediaWords/Test/Data.pm
lib/MediaWords/Test/HTTP
lib/MediaWords/Test/HTTP/HashServer.pm
lib/MediaWords/Test/LocalServer.pm
lib/MediaWords/Test/Solr.pm
lib/MediaWords/Test/Supervisor.pm
lib/MediaWords/Test/Text.pm
lib/MediaWords/Test/TopicTweets.pm
lib/MediaWords/Test/Types.pm
lib/MediaWords/Test/URLs.pm
lib/MediaWords/Util
lib/MediaWords/Util/Annotator
lib/MediaWords/Util/Annotator/CLIFF.pm
lib/MediaWords/Util/Annotator/NYTLabels.pm
lib/MediaWords/Util/CSV.pm
lib/MediaWords/Util/Colors.pm
lib/MediaWords/Util/Compress.pm
lib/MediaWords/Util/Config.pm
lib/MediaWords/Util/DateTime.pm
lib/MediaWords/Util/ExtractText.pm
lib/MediaWords/Util/Facebook.pm
lib/MediaWords/Util/HTML.pm
lib/MediaWords/Util/IdentifyLanguage.pm
lib/MediaWords/Util/JSON.pm
lib/MediaWords/Util/Log.pm
lib/MediaWords/Util/Mail
lib/MediaWords/Util/Mail.pm
lib/MediaWords/Util/Mail/Message
lib/MediaWords/Util/Mail/Message.pm
lib/MediaWords/Util/Mail/Message/Templates
lib/MediaWords/Util/Mail/Message/Templates.pm
lib/MediaWords/Util/Mail/Message/Templates/AuthAPIKeyResetMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/AuthActivatedMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/AuthActivationNeededMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/AuthPasswordChangedMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/AuthResetPasswordMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/TopicSpiderUpdateMessage.pm
lib/MediaWords/Util/Mail/Message/Templates/email-templates
lib/MediaWords/Util/Pages.pm
lib/MediaWords/Util/Paths.pm
lib/MediaWords/Util/Process.pm
lib/MediaWords/Util/Python.pm
lib/MediaWords/Util/SQL.pm
lib/MediaWords/Util/Tags.pm
lib/MediaWords/Util/Text.pm
lib/MediaWords/Util/Timing.pm
lib/MediaWords/Util/URL
lib/MediaWords/Util/URL.pm
lib/MediaWords/Util/URL/Variants.pm
lib/MediaWords/Util/Web
lib/MediaWords/Util/Web.pm
lib/MediaWords/Util/Web/Cache.pm
lib/MediaWords/Util/Web/UserAgent
lib/MediaWords/Util/Web/UserAgent.pm
lib/MediaWords/Util/Web/UserAgent/HTMLRedirects.pm
lib/MediaWords/Util/Web/UserAgent/Request.pm
lib/MediaWords/Util/Web/UserAgent/Response.pm
lib/MediaWords/Util/Word2vec
lib/MediaWords/Util/Word2vec.pm
lib/MediaWords/Util/Word2vec/SnapshotDatabaseModelStore.pm
lib/MediaWords/View
lib/MediaWords/View/TT.pm
log4perl.conf
mediacloud
mediacloud/mediawords
mediacloud/mediawords/__init__.py
mediacloud/mediawords/annotator
mediacloud/mediawords/annotator/__init__.py
mediacloud/mediawords/annotator/cliff.py
mediacloud/mediawords/annotator/nyt_labels.py
mediacloud/mediawords/db
mediacloud/mediawords/db/__init__.py
mediacloud/mediawords/db/copy
mediacloud/mediawords/db/copy/__init__.py
mediacloud/mediawords/db/copy/copy_from.py
mediacloud/mediawords/db/copy/copy_to.py
mediacloud/mediawords/db/exceptions
mediacloud/mediawords/db/exceptions/__init__.py
mediacloud/mediawords/db/exceptions/handler.py
mediacloud/mediawords/db/exceptions/result.py
mediacloud/mediawords/db/export
mediacloud/mediawords/db/export/__init__.py
mediacloud/mediawords/db/export/export_tables.py
mediacloud/mediawords/db/handler.py
mediacloud/mediawords/db/locks.py
mediacloud/mediawords/db/pages
mediacloud/mediawords/db/pages/__init__.py
mediacloud/mediawords/db/pages/pages.py
mediacloud/mediawords/db/result
mediacloud/mediawords/db/result/__init__.py
mediacloud/mediawords/db/result/result.py
mediacloud/mediawords/db/schema
mediacloud/mediawords/db/schema/__init__.py
mediacloud/mediawords/db/schema/schema.py
mediacloud/mediawords/db/schema/version.py
mediacloud/mediawords/dbi
mediacloud/mediawords/dbi/__init__.py
mediacloud/mediawords/dbi/auth
mediacloud/mediawords/dbi/auth/roles
mediacloud/mediawords/dbi/downloads.py
mediacloud/mediawords/dbi/stories.py
mediacloud/mediawords/job
mediacloud/mediawords/job/__init__.py
mediacloud/mediawords/job/cliff
mediacloud/mediawords/job/cliff/__init__.py
mediacloud/mediawords/job/cliff/fetch_annotation.py
mediacloud/mediawords/job/cliff/update_story_tags.py
mediacloud/mediawords/job/nyt_labels
mediacloud/mediawords/job/nyt_labels/__init__.py
mediacloud/mediawords/job/nyt_labels/fetch_annotation.py
mediacloud/mediawords/job/nyt_labels/update_story_tags.py
mediacloud/mediawords/job/similarweb (farewell!)
mediacloud/mediawords/job/similarweb/__init__.py (farewell!)
mediacloud/mediawords/job/similarweb/update_audience_data.py (farewell!)
mediacloud/mediawords/job/tm
mediacloud/mediawords/job/tm/extract_story_links_job.py
mediacloud/mediawords/job/tm/fetch_link_job.py
mediacloud/mediawords/job/word2vec
mediacloud/mediawords/job/word2vec/__init__.py
mediacloud/mediawords/job/word2vec/generate_snapshot_model.py
mediacloud/mediawords/key_value_store
mediacloud/mediawords/key_value_store/__init__.py
mediacloud/mediawords/key_value_store/amazon_s3.py
mediacloud/mediawords/key_value_store/cached_amazon_s3.py
mediacloud/mediawords/key_value_store/database_inline.py
mediacloud/mediawords/key_value_store/multiple_stores.py
mediacloud/mediawords/key_value_store/postgresql.py
mediacloud/mediawords/languages
mediacloud/mediawords/languages/__init__.py
mediacloud/mediawords/languages/ca
mediacloud/mediawords/languages/ca/__init__.py
mediacloud/mediawords/languages/ca/among.py
mediacloud/mediawords/languages/ca/basestemmer.py
mediacloud/mediawords/languages/ca/ca_stop_words.txt
mediacloud/mediawords/languages/ca/catalan_stemmer.py
mediacloud/mediawords/languages/ca/generate_python_class.sh
mediacloud/mediawords/languages/ca/snowball_stemmer
mediacloud/mediawords/languages/ca/snowball_stemmer/README.md
mediacloud/mediawords/languages/ca/snowball_stemmer/stemmer.java
mediacloud/mediawords/languages/ca/snowball_stemmer/stemmer.sbl
mediacloud/mediawords/languages/da
mediacloud/mediawords/languages/da/__init__.py
mediacloud/mediawords/languages/da/da_stop_words.txt
mediacloud/mediawords/languages/de
mediacloud/mediawords/languages/de/__init__.py
mediacloud/mediawords/languages/de/de_stop_words.txt
mediacloud/mediawords/languages/en
mediacloud/mediawords/languages/en/__init__.py
mediacloud/mediawords/languages/en/en_stop_words.txt
mediacloud/mediawords/languages/es
mediacloud/mediawords/languages/es/__init__.py
mediacloud/mediawords/languages/es/es_stop_words.txt
mediacloud/mediawords/languages/factory.py
mediacloud/mediawords/languages/fi
mediacloud/mediawords/languages/fi/__init__.py
mediacloud/mediawords/languages/fi/fi_stop_words.txt
mediacloud/mediawords/languages/fr
mediacloud/mediawords/languages/fr/__init__.py
mediacloud/mediawords/languages/fr/fr_stop_words.txt
mediacloud/mediawords/languages/ha (farewell!)
mediacloud/mediawords/languages/ha/__init__.py (farewell!)
mediacloud/mediawords/languages/ha/ha_stop_words.txt (farewell!)
mediacloud/mediawords/languages/hi
mediacloud/mediawords/languages/hi/__init__.py
mediacloud/mediawords/languages/hi/hi_stop_words.txt
mediacloud/mediawords/languages/hi/hindi-hunspell
mediacloud/mediawords/languages/hu
mediacloud/mediawords/languages/hu/__init__.py
mediacloud/mediawords/languages/hu/hu_stop_words.txt
mediacloud/mediawords/languages/it
mediacloud/mediawords/languages/it/__init__.py
mediacloud/mediawords/languages/it/it_stop_words.txt
mediacloud/mediawords/languages/ja
mediacloud/mediawords/languages/ja/__init__.py
mediacloud/mediawords/languages/ja/ja_stop_words.txt
mediacloud/mediawords/languages/lt
mediacloud/mediawords/languages/lt/__init__.py
mediacloud/mediawords/languages/lt/among.py
mediacloud/mediawords/languages/lt/basestemmer.py
mediacloud/mediawords/languages/lt/generate_python_class.sh
mediacloud/mediawords/languages/lt/lithuanian_stemmer.py
mediacloud/mediawords/languages/lt/lt_stop_words.txt
mediacloud/mediawords/languages/lt/snowball_stemmer
mediacloud/mediawords/languages/lt/snowball_stemmer/LICENSE
mediacloud/mediawords/languages/lt/snowball_stemmer/README.md
mediacloud/mediawords/languages/lt/snowball_stemmer/conservative.sbl
mediacloud/mediawords/languages/lt/snowball_stemmer/lithuanian.sbl
mediacloud/mediawords/languages/nl
mediacloud/mediawords/languages/nl/__init__.py
mediacloud/mediawords/languages/nl/nl_stop_words.txt
mediacloud/mediawords/languages/no
mediacloud/mediawords/languages/no/__init__.py
mediacloud/mediawords/languages/no/no_stop_words.txt
mediacloud/mediawords/languages/pt
mediacloud/mediawords/languages/pt/__init__.py
mediacloud/mediawords/languages/pt/pt_stop_words.txt
mediacloud/mediawords/languages/ro
mediacloud/mediawords/languages/ro/__init__.py
mediacloud/mediawords/languages/ro/ro_stop_words.txt
mediacloud/mediawords/languages/ru
mediacloud/mediawords/languages/ru/__init__.py
mediacloud/mediawords/languages/ru/ru_stop_words.txt
mediacloud/mediawords/languages/sv
mediacloud/mediawords/languages/sv/__init__.py
mediacloud/mediawords/languages/sv/sv_stop_words.txt
mediacloud/mediawords/languages/tr
mediacloud/mediawords/languages/tr/__init__.py
mediacloud/mediawords/languages/tr/tr_stop_words.txt
mediacloud/mediawords/languages/zh
mediacloud/mediawords/languages/zh/README.md
mediacloud/mediawords/languages/zh/__init__.py
mediacloud/mediawords/languages/zh/dict.txt.big
mediacloud/mediawords/languages/zh/userdict.txt
mediacloud/mediawords/languages/zh/zh_stop_words.txt
mediacloud/mediawords/similarweb
mediacloud/mediawords/similarweb/__init__.py
mediacloud/mediawords/similarweb/similarweb.py
mediacloud/mediawords/similarweb/tasks.py
mediacloud/mediawords/similarweb/test
mediacloud/mediawords/similarweb/test/__init__.py
mediacloud/mediawords/solr
mediacloud/mediawords/solr/__init__.py
mediacloud/mediawords/solr/query.py
mediacloud/mediawords/solr/run
mediacloud/mediawords/solr/run/__init__.py
mediacloud/mediawords/solr/run/constants.py
mediacloud/mediawords/solr/run/solr.py
mediacloud/mediawords/solr/run/zookeeper.py
mediacloud/mediawords/test
mediacloud/mediawords/test/__init__.py
mediacloud/mediawords/test/db
mediacloud/mediawords/test/db/__init__.py
mediacloud/mediawords/test/db/env.py
mediacloud/mediawords/test/db/handler_proxy.py
mediacloud/mediawords/test/http
mediacloud/mediawords/test/http/__init__.py
mediacloud/mediawords/test/http/hash_server.py
mediacloud/mediawords/tm
mediacloud/mediawords/tm/__init__.py
mediacloud/mediawords/tm/extract_story_links.py
mediacloud/mediawords/tm/fetch_link.py
mediacloud/mediawords/tm/fetch_topic_tweets.py
mediacloud/mediawords/tm/guess_date.py
mediacloud/mediawords/tm/media.py
mediacloud/mediawords/tm/mine.py
mediacloud/mediawords/tm/snapshot
mediacloud/mediawords/tm/snapshot/graph_layout.py
mediacloud/mediawords/tm/stories.py
mediacloud/mediawords/util
mediacloud/mediawords/util/__init__.py
mediacloud/mediawords/util/colors.py
mediacloud/mediawords/util/compress.py
mediacloud/mediawords/util/config.py
mediacloud/mediawords/util/extract_text.py
mediacloud/mediawords/util/html.py
mediacloud/mediawords/util/identify_language.py
mediacloud/mediawords/util/json.py
mediacloud/mediawords/util/log.py
mediacloud/mediawords/util/mail.py
mediacloud/mediawords/util/mail_message
mediacloud/mediawords/util/mail_message/templates.py
mediacloud/mediawords/util/network.py
mediacloud/mediawords/util/pages.py
mediacloud/mediawords/util/paths.py
mediacloud/mediawords/util/perl.py
mediacloud/mediawords/util/process.py
mediacloud/mediawords/util/sql.py
mediacloud/mediawords/util/text.py
mediacloud/mediawords/util/url
mediacloud/mediawords/util/url/__init__.py
mediacloud/mediawords/util/url/shorteners.py
mediacloud/mediawords/util/url/variants.py
mediacloud/mediawords/util/web
mediacloud/mediawords/util/web/__init__.py
mediacloud/mediawords/util/web/user_agent
mediacloud/mediawords/util/web/user_agent/__init__.py
mediacloud/mediawords/util/web/user_agent/html_redirects.py
mediacloud/mediawords/util/web/user_agent/request
mediacloud/mediawords/util/web/user_agent/request/__init__.py
mediacloud/mediawords/util/web/user_agent/request/request.py
mediacloud/mediawords/util/web/user_agent/response
mediacloud/mediawords/util/web/user_agent/response/__init__.py
mediacloud/mediawords/util/web/user_agent/response/response.py
mediacloud/mediawords/util/web/user_agent/throttled.py
mediacloud/mediawords/util/word2vec
mediacloud/mediawords/util/word2vec/__init__.py
mediacloud/mediawords/util/word2vec/exceptions.py
mediacloud/mediawords/util/word2vec/model_stores.py
mediacloud/mediawords/util/word2vec/sentence_iterators.py
mediacloud/requirements.txt
mediacloud/setup.cfg
mediacloud/snowball
mediacloud/test-data
mediacloud/test-data/ch
mediacloud/test-data/ch/ch-posts-2016-01-01.json
mediacloud/test-data/ch/ch-posts-2016-01-02.json
mediacloud/test-data/ch/ch-posts-2016-01-03.json
mediacloud/test-data/ch/ch-posts-2016-01-04.json
mediacloud/test-data/ch/ch-posts-2016-01-05.json
mediacloud/test-data/html
mediacloud/test-data/html/strip.html
mediawords.yml
mediawords.yml.dist
root (farewell!)
root/ some time ago, so now I think more or less every script / view in it is being used by some part of our "legacy" web UI. It would be nice to scrap the whole legacy website so we could remove all of those 8 or so different jQuery versions that reside here.
root/auth (farewell!)
root/auth/email_authentication_needed.tt2 (farewell!)
root/auth/forgot.tt2 (farewell!)
root/auth/login.tt2 (farewell!)
root/auth/profile.tt2 (farewell!)
root/auth/register.tt2 (farewell!)
root/auth/reset.tt2 (farewell!)
root/auth/welcome.tt2 (farewell!)
root/common
root/common/error_page.tt2
root/common/html_footer.tt2
root/common/html_head.tt2
root/common/third_party_libs.tt2
root/downloads
root/downloads/list.tt2
root/facet_counts
root/facet_counts/facet_counts.tt2
root/favicon.ico
root/feeds
root/feeds/batch_create.tt2
root/feeds/batch_edit_feeds.tt2
root/feeds/batch_edit_tags.tt2
root/feeds/delete.tt2
root/feeds/edit.tt2
root/feeds/edit_tags.tt2
root/feeds/list.tt2
root/feeds/scrape.tt2
root/forms
root/forms/admin
root/forms/admin/tm
root/forms/admin/tm/create_topic.yml
root/forms/admin/tm/focus.yml
root/forms/admin/tm/merge_media.yml
root/forms/admin/tm/merge_stories.yml
root/forms/admin/tm/topic_media_type.yml
root/forms/auth
root/forms/auth/changepass.yml
root/forms/auth/forgot.yml
root/forms/auth/login.yml
root/forms/auth/reset.yml
root/forms/edit_tags.yml.tt2
root/forms/feeds.yml
root/forms/login
root/forms/login/register.yml
root/forms/media.yml
root/forms/media_search.yml
root/forms/monitor
root/forms/monitor/crawler_google_data_table.yml
root/forms/monitor/view.yml
root/forms/scrape_feeds.yml
root/forms/story.yml
root/forms/tag.yml
root/forms/tag_set.yml
root/forms/term.yml
root/forms/terms.yml
root/forms/users
root/forms/users/create.yml
root/forms/users/edit.yml
root/forms/visualize.yml
root/gexf
root/gexf/.project
root/gexf/LICENSE
root/gexf/README.md
root/gexf/config.js
root/gexf/img
root/gexf/img/fleches-horiz.png
root/gexf/img/gephi.png
root/gexf/img/loupe-edges.png
root/gexf/img/plusmoins.png
root/gexf/img/search.gif
root/gexf/index.html
root/gexf/js
root/gexf/js/gexfjs.js
root/gexf/js/jquery-1.7.2.min.js
root/gexf/js/jquery-ui-1.8.16.custom.min.js
root/gexf/js/jquery.mousewheel.min.js
root/gexf/styles
root/gexf/styles/gexfjs.css
root/gexf/styles/jquery-ui.css
root/health
root/health/list.tt2
root/health/medium.tt2
root/health/stories.tt2
root/health/tag.tt2
root/health/tag_sets.tt2
root/include
root/include/auth
root/include/auth/footer.tt2
root/include/auth/header.tt2
root/include/auth/style.css
root/include/clusterstyle.css
root/include/feeds_header.tt2
root/include/final_tweaks.css
root/include/footer.tt2
root/include/google_analytics.tt2
root/include/header.tt2
root/include/header_standalone.tt2
root/include/images
root/include/images/backstripes.gif
root/include/images/containerback.gif
root/include/images/curve.gif
root/include/images/dots.gif
root/include/images/downarrow.gif
root/include/images/example-2.gif
root/include/images/header.jpg
root/include/images/mc-flow-2b.png
root/include/images/rightarrow.gif
root/include/images/spacer.gif
root/include/images/uparrow.gif
root/include/images/verticaldots.gif
root/include/jquery-1.5.1.js
root/include/jquery-ui-1.8.5.custom.min.js
root/include/jquery.iframe-auto-height.plugin.js
root/include/jquery.iframe-auto-height.plugin_msie_workaround.js
root/include/jquery.url.js
root/include/jquery.validate.js
root/include/libs
root/include/libs/handsontable
root/include/libs/jquery.ba-bbq.js
root/include/libs/jquery.query-2.1.7.js
root/include/libs/scrollTo
root/include/libs/scrollTo/changes.txt
root/include/libs/scrollTo/jquery.scrollTo-min.js
root/include/libs/scrollTo/jquery.scrollTo.js
root/include/libs/tag-it
root/include/libs/tag-it/LICENSE
root/include/libs/tag-it/README.markdown
root/include/libs/tag-it/TODO
root/include/libs/tag-it/css
root/include/libs/tag-it/css/examples.css
root/include/libs/tag-it/css/jquery.tagit.css
root/include/libs/tag-it/css/master.css
root/include/libs/tag-it/css/reset.css
root/include/libs/tag-it/css/tagit.ui-zendesk.css
root/include/libs/tag-it/examples.html
root/include/libs/tag-it/js
root/include/libs/tag-it/js/tag-it.js
root/include/libs/tag-it/screenshot.png
root/include/media_cloud_css.css
root/include/media_source_list
root/include/pager.tt2
root/include/smoothness
root/include/smoothness/images
root/include/smoothness/images/ui-bg_flat_0_aaaaaa_40x100.png
root/include/smoothness/images/ui-bg_flat_75_ffffff_40x100.png
root/include/smoothness/images/ui-bg_glass_55_fbf9ee_1x400.png
root/include/smoothness/images/ui-bg_glass_65_ffffff_1x400.png
root/include/smoothness/images/ui-bg_glass_75_dadada_1x400.png
root/include/smoothness/images/ui-bg_glass_75_e6e6e6_1x400.png
root/include/smoothness/images/ui-bg_glass_95_fef1ec_1x400.png
root/include/smoothness/images/ui-bg_highlight-soft_75_cccccc_1x100.png
root/include/smoothness/images/ui-icons_222222_256x240.png
root/include/smoothness/images/ui-icons_2e83ff_256x240.png
root/include/smoothness/images/ui-icons_454545_256x240.png
root/include/smoothness/images/ui-icons_888888_256x240.png
root/include/smoothness/images/ui-icons_cd0a0a_256x240.png
root/include/smoothness/jquery-ui-1.8.5.custom.css
root/include/style.css
root/include/users_header.tt2
root/include/vertically-aligned-ie.css
root/include/vertically-aligned.css
root/include/word_cloud.css
root/include/word_cloud_list.css
root/media
root/media/batch_edit_tags.tt2
root/media/create_batch.tt2
root/media/delete.tt2
root/media/edit.tt2
root/media/edit_tags.tt2
root/media/eval_rss_full_text.tt2
root/media/find_likely_full_text.tt2
root/media/moderate
root/media/moderate/media.tt2
root/media/moderate/merge.tt2
root/media/moderate/tags.tt2
root/media/search.tt2
root/nv
root/nv/README.txt
root/nv/config.json
root/nv/css
root/nv/css/style.css
root/nv/css/tablet.css
root/nv/htaccess_example
root/nv/images
root/nv/images/CC.png
root/nv/images/blank.gif
root/nv/images/fancybox_loading.gif
root/nv/images/fancybox_sprite.png
root/nv/images/info.png
root/nv/images/jisc-logo-small.png
root/nv/images/oii.png
root/nv/images/oii_brand.png
root/nv/images/oii_text.png
root/nv/images/rainbow.png
root/nv/images/sprite.png
root/nv/images/zoom_in.png
root/nv/images/zoom_out.png
root/nv/images/zoom_reset.png
root/nv/index.html
root/nv/js
root/nv/js/excanvas.js
root/nv/js/fancybox
root/nv/js/fancybox/jquery.fancybox.css
root/nv/js/fancybox/jquery.fancybox.pack.js
root/nv/js/jquery
root/nv/js/jquery/jquery.min.js
root/nv/js/main.js
root/nv/js/sigma
root/nv/js/sigma/_sigma.min.js
root/nv/js/sigma/parseGexf_fin.js
root/nv/js/sigma/sigma.js
root/nv/js/sigma/sigma.min.js
root/nv/js/sigma/sigma.parseGexf.js
root/nv/js/sigma/sigma.parseJson.js
root/nv/nv.tt2
root/nv/web.config
root/script
root/script/jquery-1.4.2.min.js
root/script/jquery.metadata.js
root/script/jquery.tablesorter-themes
root/script/jquery.tablesorter-themes/blue
root/script/jquery.tablesorter-themes/blue/asc.gif
root/script/jquery.tablesorter-themes/blue/bg.gif
root/script/jquery.tablesorter-themes/blue/desc.gif
root/script/jquery.tablesorter-themes/blue/style.css
root/script/jquery.tablesorter-themes/green
root/script/jquery.tablesorter-themes/green/asc.png
root/script/jquery.tablesorter-themes/green/bg.png
root/script/jquery.tablesorter-themes/green/desc.png
root/script/jquery.tablesorter-themes/green/style.css
root/script/jquery.tablesorter.min.js
root/script/non-free
root/script/non-free/amcharts3
root/search
root/search/diff.tt2
root/search/media.tt2
root/search/readme.tt2
root/search/search.tt2
root/search/tag_sets.tt2
root/search/tags.tt2
root/search/wc.tt2
root/static
root/static/images
root/static/images/btn_120x50_built.png
root/static/images/btn_120x50_built_shadow.png
root/static/images/btn_120x50_powered.png
root/static/images/btn_120x50_powered_shadow.png
root/static/images/btn_88x31_built.png
root/static/images/btn_88x31_built_shadow.png
root/static/images/btn_88x31_powered.png
root/static/images/btn_88x31_powered_shadow.png
root/static/images/catalyst_logo.png
root/stats
root/stats/media_tag_counts.tt2
root/stats/media_tag_counts_simple.tt2
root/stories
root/stories/add_tag.tt2
root/stories/delete_tag.tt2
root/stories/edit.tt2
root/stories/list.tt2
root/stories/retag.tt2
root/stories/tag.tt2
root/stories/view.tt2
root/tag_sets
root/tag_sets/edit.tt2
root/tags
root/tags/edit.tt2
root/tm
root/tm/activities.tt2
root/tm/add_focus.tt2
root/tm/add_media_type.tt2
root/tm/add_media_types.tt2
root/tm/create_topic.tt2
root/tm/delete_stories.tt2
root/tm/edit_dates.tt2
root/tm/edit_foci.tt2
root/tm/edit_media_type.tt2
root/tm/edit_media_types.tt2
root/tm/edit_topic.tt2
root/tm/header.tt2
root/tm/include
root/tm/include/latest_activities.tt2
root/tm/influential_media_words.tt2
root/tm/list.tt2
root/tm/media.tt2
root/tm/media_table.tt2
root/tm/medium.tt2
root/tm/merge_media.tt2
root/tm/merge_stories.tt2
root/tm/merge_stories_list.tt2
root/tm/mining_status.tt2
root/tm/model_reliability.tt2
root/tm/mot
root/tm/mot/d3.min.js
root/tm/mot/jquery.min.js
root/tm/mot/mot.tt2
root/tm/mot/scripts.js
root/tm/partisan.tt2
root/tm/remove_stories_confirm_js.tt2
root/tm/stories.tt2
root/tm/stories_table.tt2
root/tm/story.tt2
root/tm/story_stats.tt2
root/tm/story_tweets.tt2
root/tm/timespans_table.tt2
root/tm/unredirect_medium.tt2
root/tm/view.tt2
root/tm/view_snapshot.tt2
root/tm/view_timespan.tt2
root/tm/words.tt2
root/users
root/users/create.tt2
root/users/delete.tt2
root/users/edit.tt2
root/users/edit_tag_set_permissions.tt2
root/users/list.tt2
root/users/usage.tt2
root/visualize
root/visualize/search.tt2
schema
schema/mediawords.sql
db_row_last_updated everywhere
media.moderated column
media.moderation_notes column
media.foreign_rss_links column
media.is_not_dup column
(NULL) boolean column with a negation in the name! I've found a script which sets this column but I'm not sure if it's ever read anywhere, so maybe scrap it? If it's useful, let's at least rename it (e.g. if the value is NULL, does it mean that the media is / isn't a duplicate, or has / hasn't a duplicate?)
media.last_solr_import_date column
media.is_monitored column
media_name_trgm, media_url_trgm indexes
feeds.reparse column
feed_is_stale() function
media_type_tags table and its tags
media_rss_full_text_detection_data table and related code in /admin/media/eval_rss_full_text
media_with_collections view
stories_ap_syndicated table
download_type's enum values Calais, calais, spider_*, archival_only, and relevant rows in downloads table
downloads_in_old_format, downloads_sites_downloads_id_pending indexes
downloads_sites view
feedly_unscraped_feeds view
schema/migrations
script
mediawords_ prefix for every script in this directory?
script/database_schema_version.pl
script/export_import
script/export_import/export_feed_downloads_from_backup_crawler.pl
script/export_import/import_feed_downloads_to_db.pl
script/export_import/raw_download_content_column.inc.pl
script/generate_empty_sql_migration.sh
script/grant_mediacloud_ro_permissions.pl
script/jumpstart_perl_to_python.pl
script/mediawords_add_default_feeds.pl
script/mediawords_add_public_users.pl (farewell!)
script/mediawords_automark_full_text_rss.pl (farewell!)
script/mediawords_cgi.pl
script/mediawords_crawl.pl
script/mediawords_create_db.pl
script/mediawords_create_recent_indexes.pl (farewell!)
script/mediawords_dedup_topic_media.pl
script/mediawords_delete_extra_feeds.pl (farewell!)
script/mediawords_delete_media_and_feeds.pl (farewell!)
script/mediawords_download_and_handle_feed.pl (farewell!)
script/mediawords_dump_download_to_file.pl (farewell!)
script/mediawords_dump_story_search.pl
script/mediawords_dump_table.pl
script/mediawords_dump_topic.pl
script/mediawords_export_content_for_query.pl
script/mediawords_export_content_for_solr_query.pl
script/mediawords_extract_and_output.pl (farewell!)
script/mediawords_extract_and_vector_locally.pl (farewell!)
script/mediawords_extract_single_story.pl (farewell!)
script/mediawords_extract_youtube_embeds.pl (farewell!)
script/mediawords_fastcgi.pl
script/mediawords_fetch_facebook_url_counts.pl (farewell!)
script/mediawords_fetch_social_stats.pl (farewell!)
script/mediawords_fetch_url.pl (farewell!)
User-Agent?
script/mediawords_find_ap_mentions.pl (farewell!)
script/mediawords_find_corrupted_sequences.pl (farewell!)
script/mediawords_fix_dates.pl (farewell!)
script/mediawords_fix_wapo_urls.pl (farewell!)
script/mediawords_generate_daily_rss_dumps.pl (farewell!)
script/mediawords_generate_election_foci.pl (farewell!)
script/mediawords_generate_feedly_import_validation.pl (farewell!)
script/mediawords_generate_media_health.pl
script/mediawords_generate_media_stats.pl
script/mediawords_generate_monthly_topic_gexfs.pl
script/mediawords_generate_retweet_foci.pl (farewell!)
script/mediawords_generate_retweeter_csvs.pl (farewell!)
script/mediawords_generate_retweeter_scores.pl (farewell!)
script/mediawords_generate_supervisord_conf.pl
script/mediawords_generate_topic_story_words.pl (farewell!)
script/mediawords_generate_user_summary.pl
script/mediawords_generate_weekly_fake_news_report.pl (farewell!)
script/mediawords_guess_date.pl
script/mediawords_import_archive_org_tv_captions.pl (farewell!)
script/mediawords_import_downloads_and_stories.pl (farewell!)
script/mediawords_import_solr_data.pl (farewell!)
script/mediawords_import_stories.pl
script/mediawords_import_story_communities.pl (farewell!)
script/mediawords_import_topic_seed_urls.pl (farewell!)
script/mediawords_make_medium_full_text_rss.pl (farewell!)
script/mediawords_manage_users.pl
script/mediawords_mark_aggregator_media.pl (farewell!)
script/mediawords_mark_story_downloads_for_rextraction.pl (farewell!)
script/mediawords_mine_topic.pl
script/mediawords_perltidy_config_file
script/mediawords_print_queues.pl (farewell!)
script/mediawords_process_download_for_extractor.pl (farewell!)
script/mediawords_process_new_twitter_media.pl (farewell!)
script/mediawords_query_config.pl
script/mediawords_query_to_regexp.pl (farewell!)
script/mediawords_queue_jobs_from_query.pl (farewell!)
script/mediawords_reextract_downloads.pl (farewell!)
script/mediawords_reformat_all_code.sh (farewell!)
script/mediawords_reformat_code.pl
script/mediawords_refresh_mediacloud_stats.pl
script/mediawords_rehandle_feeds.pl (farewell!)
script/mediawords_reindex_db.pl (farewell!)
script/mediawords_rescrape_due_media.pl
script/mediawords_restore_download.pl (farewell!)
script/mediawords_revector_stories.pl (farewell!)
script/mediawords_run_dashboard_queries.pl (farewell!)
script/mediawords_run_tm_dump_query.pl (farewell!)
script/mediawords_scrape_feedly.pl (farewell!)
script/mediawords_server.pl
script/mediawords_set_media_tags.pl (farewell!)
script/mediawords_snapshot_topic.pl
script/mediawords_story_tags_from_seed_urls.pl (farewell!)
script/mediawords_swap_solr_live_collection.pl (farewell!)
script/mediawords_test_ap_detection.pl (farewell!)
script/mediawords_update_ap_syndication.pl (farewell!)
script/mediawords_upgrade_db.pl
script/mediawords_verify_downloads.pl (farewell!)
script/mediawords_write_random_story_contents.pl (farewell!)
script/pre_commit_hooks
script/pre_commit_hooks/README.mdown
script/pre_commit_hooks/apgdiff-2.5-pre.jar
script/pre_commit_hooks/apgdiff-license.txt
script/pre_commit_hooks/hook-db-schema-version.sh
script/pre_commit_hooks/hook-flake8.sh
script/pre_commit_hooks/hook-perl-syntax-formatting.sh
script/pre_commit_hooks/postgres-diff.sh
script/pre_commit_hooks/pre-commit
script/rabbitmq_wrapper.sh
script/run_compile_test.sh
script/run_dev_server.sh
script/run_fcgi_with_plackup.sh
script/run_in_env.sh
script/run_test_suite.sh
script/set_mc_root_dir.inc.sh
script/set_perlbrew_environment.sh
script/set_virtualenv_environment.sh
script/vagrant
script/vagrant/Vagrantfile
script/vagrant/aws_ec2_dummy.box
script/vagrant/provision_root.sh
script/vagrant/provision_user.sh
script/vagrant/run_install_test_suite_on_vagrant.sh
solr
solr/collections
solr/collections/README.txt
solr/collections/_base_collection
solr/collections/_base_collection/conf
solr/collections/_base_collection/conf/_rest_managed.json
solr/collections/_base_collection/conf/_schema_analysis_stopwords_english.json
solr/collections/_base_collection/conf/_schema_analysis_synonyms_english.json
solr/collections/_base_collection/conf/admin-extra.html
solr/collections/_base_collection/conf/admin-extra.menu-bottom.html
solr/collections/_base_collection/conf/admin-extra.menu-top.html
solr/collections/_base_collection/conf/clustering
solr/collections/_base_collection/conf/clustering/carrot2
solr/collections/_base_collection/conf/clustering/carrot2/README.txt
solr/collections/_base_collection/conf/clustering/carrot2/kmeans-attributes.xml
solr/collections/_base_collection/conf/clustering/carrot2/lingo-attributes.xml
solr/collections/_base_collection/conf/clustering/carrot2/stc-attributes.xml
solr/collections/_base_collection/conf/currency.xml
solr/collections/_base_collection/conf/elevate.xml
solr/collections/_base_collection/conf/lang
solr/collections/_base_collection/conf/lang/contractions_ca.txt
solr/collections/_base_collection/conf/lang/contractions_fr.txt
solr/collections/_base_collection/conf/lang/contractions_ga.txt
solr/collections/_base_collection/conf/lang/contractions_it.txt
solr/collections/_base_collection/conf/lang/hyphenations_ga.txt
solr/collections/_base_collection/conf/lang/stemdict_nl.txt
solr/collections/_base_collection/conf/lang/stoptags_ja.txt
solr/collections/_base_collection/conf/lang/stopwords_ar.txt
solr/collections/_base_collection/conf/lang/stopwords_bg.txt
solr/collections/_base_collection/conf/lang/stopwords_ca.txt
solr/collections/_base_collection/conf/lang/stopwords_ckb.txt
solr/collections/_base_collection/conf/lang/stopwords_cz.txt
solr/collections/_base_collection/conf/lang/stopwords_da.txt
solr/collections/_base_collection/conf/lang/stopwords_de.txt
solr/collections/_base_collection/conf/lang/stopwords_el.txt
solr/collections/_base_collection/conf/lang/stopwords_en.txt
solr/collections/_base_collection/conf/lang/stopwords_es.txt
solr/collections/_base_collection/conf/lang/stopwords_eu.txt
solr/collections/_base_collection/conf/lang/stopwords_fa.txt
solr/collections/_base_collection/conf/lang/stopwords_fi.txt
solr/collections/_base_collection/conf/lang/stopwords_fr.txt
solr/collections/_base_collection/conf/lang/stopwords_ga.txt
solr/collections/_base_collection/conf/lang/stopwords_gl.txt
solr/collections/_base_collection/conf/lang/stopwords_hi.txt
solr/collections/_base_collection/conf/lang/stopwords_hu.txt
solr/collections/_base_collection/conf/lang/stopwords_hy.txt
solr/collections/_base_collection/conf/lang/stopwords_id.txt
solr/collections/_base_collection/conf/lang/stopwords_it.txt
solr/collections/_base_collection/conf/lang/stopwords_ja.txt
solr/collections/_base_collection/conf/lang/stopwords_lv.txt
solr/collections/_base_collection/conf/lang/stopwords_nl.txt
solr/collections/_base_collection/conf/lang/stopwords_no.txt
solr/collections/_base_collection/conf/lang/stopwords_pt.txt
solr/collections/_base_collection/conf/lang/stopwords_ro.txt
solr/collections/_base_collection/conf/lang/stopwords_ru.txt
solr/collections/_base_collection/conf/lang/stopwords_sv.txt
solr/collections/_base_collection/conf/lang/stopwords_th.txt
solr/collections/_base_collection/conf/lang/stopwords_tr.txt
solr/collections/_base_collection/conf/lang/userdict_ja.txt
solr/collections/_base_collection/conf/mapping-FoldToASCII.txt
solr/collections/_base_collection/conf/mapping-ISOLatin1Accent.txt
solr/collections/_base_collection/conf/params.json
solr/collections/_base_collection/conf/protwords.txt
solr/collections/_base_collection/conf/schema.xml
solr/collections/_base_collection/conf/solrconfig.xml
solr/collections/_base_collection/conf/spellings.txt
solr/collections/_base_collection/conf/stopwords.txt
solr/collections/_base_collection/conf/synonyms.txt
solr/collections/_base_collection/conf/update-script.js
solr/collections/_base_collection/conf/velocity
solr/collections/_base_collection/conf/velocity/README.txt
solr/collections/_base_collection/conf/velocity/VM_global_library.vm
solr/collections/_base_collection/conf/velocity/browse.vm
solr/collections/_base_collection/conf/velocity/cluster.vm
solr/collections/_base_collection/conf/velocity/cluster_results.vm
solr/collections/_base_collection/conf/velocity/debug.vm
solr/collections/_base_collection/conf/velocity/did_you_mean.vm
solr/collections/_base_collection/conf/velocity/error.vm
solr/collections/_base_collection/conf/velocity/facet_fields.vm
solr/collections/_base_collection/conf/velocity/facet_pivot.vm
solr/collections/_base_collection/conf/velocity/facet_queries.vm
solr/collections/_base_collection/conf/velocity/facet_ranges.vm
solr/collections/_base_collection/conf/velocity/facets.vm
solr/collections/_base_collection/conf/velocity/footer.vm
solr/collections/_base_collection/conf/velocity/head.vm
solr/collections/_base_collection/conf/velocity/header.vm
solr/collections/_base_collection/conf/velocity/hit.vm
solr/collections/_base_collection/conf/velocity/hit_grouped.vm
solr/collections/_base_collection/conf/velocity/hit_plain.vm
solr/collections/_base_collection/conf/velocity/join_doc.vm
solr/collections/_base_collection/conf/velocity/jquery.autocomplete.css
solr/collections/_base_collection/conf/velocity/jquery.autocomplete.js
solr/collections/_base_collection/conf/velocity/layout.vm
solr/collections/_base_collection/conf/velocity/main.css
solr/collections/_base_collection/conf/velocity/mime_type_lists.vm
solr/collections/_base_collection/conf/velocity/pagination_bottom.vm
solr/collections/_base_collection/conf/velocity/pagination_top.vm
solr/collections/_base_collection/conf/velocity/product_doc.vm
solr/collections/_base_collection/conf/velocity/query.vm
solr/collections/_base_collection/conf/velocity/query_form.vm
solr/collections/_base_collection/conf/velocity/query_group.vm
solr/collections/_base_collection/conf/velocity/query_spatial.vm
solr/collections/_base_collection/conf/velocity/results_list.vm
solr/collections/_base_collection/conf/velocity/richtext_doc.vm
solr/collections/_base_collection/conf/velocity/suggest.vm
solr/collections/_base_collection/conf/velocity/tabs.vm
solr/collections/_base_collection/conf/xslt
solr/collections/_base_collection/conf/xslt/example.xsl
solr/collections/_base_collection/conf/xslt/example_atom.xsl
solr/collections/_base_collection/conf/xslt/example_rss.xsl
solr/collections/_base_collection/conf/xslt/luke.xsl
solr/collections/_base_collection/conf/xslt/updateXml.xsl
solr/collections/collection1
solr/collections/collection1/conf
solr/collections/collection2 (farewell!)
solr/collections/collection2/conf (farewell!)
collection2 (the "test" collection) should be removed, together with a bunch of error-prone code which "swaps" collections between "production" and "testing".
solr/contexts
solr/contexts/solr-jetty-context.xml
solr/etc
solr/etc/jetty-http.xml
solr/etc/jetty-https.xml
solr/etc/jetty-ssl.xml
solr/etc/jetty.xml
solr/etc/webdefault.xml
solr/modules
solr/modules/http.mod
solr/modules/https.mod
solr/modules/server.mod
solr/modules/ssl.mod
solr/resources
solr/resources/jetty-logging.properties
solr/resources/log4j.properties
solr/solr.xml
supervisor
supervisor/supervisorctl.sh
supervisor/supervisord.conf
supervisor/supervisord.conf.tt2
supervisor/supervisord.sh
tools
tools/benchmark
tools/benchmark/benchmark_date_guessing.py (farewell!)
tools/benchmark/benchmark_html_strip.py (farewell!)
tools/benchmark/benchmark_topic_regex.py (farewell!)
tools/cpan
tools/cpan/mirror-cpan-on-s3.sh
tools/db
tools/db/copy_nonpartitioned_sentences_to_partitions.py
tools/db/create_default_db_user_and_databases.sh
tools/db/create_missing_partitions.py
tools/db/export_import
tools/db/export_import/export_tables_to_backup_crawler.py
tools/db/postgresql_helpers.inc.sh
tools/db/purge_mediacloud_databases.sh
tools/db/purge_object_caches.py
tools/graph
tools/graph/generate_gexf_from_csv.py (farewell!)
tools/graph/layout_with_fa2.py (farewell!)
tools/similarweb (farewell!)
tools/similarweb/add_all_media_to_similarweb_queue.py (farewell!)
tools/solr
tools/solr/run
tools/solr/run/optimize_solr_index.py
tools/solr/run/reload_solr_shards.py
tools/solr/run/run_solr_shard.py
tools/solr/run/run_solr_standalone.py
tools/solr/run/run_zookeeper.py
tools/solr/run/update_zookeeper_config.py
tools/solr/run/upgrade_lucene_index.py
tools/solr/solr-to-re.py
tools/supervisor
tools/supervisor/rotate_supervisor_logs.py
tools/web
tools/web/rotate_http_request_log.py
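
For schema items like media.is_not_dup, where it's not clear whether anything still reads the column, a quick scan of the checkout can help before deciding. Below is a rough sketch of such a scan, not an existing tool in this repository: the function name find_references and the default list of file extensions are assumptions made up for illustration, and it only handles plain identifiers (word characters), not names with punctuation such as feed_is_stale().

```python
import os
import re
from typing import Iterator, Tuple


def find_references(
    root: str,
    identifier: str,
    extensions: Tuple[str, ...] = (".pl", ".pm", ".py", ".sql", ".tt2"),
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for whole-word matches of identifier.

    Hypothetical helper: the extension list is a guess at which file types
    are worth scanning and should be adjusted to the actual checkout.
    """
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on files that are not valid UTF-8.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_no, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        yield path, line_no, line.rstrip("\n")


if __name__ == "__main__":
    # Scan the current directory for one of the columns discussed above.
    for path, line_no, line in find_references(".", "is_not_dup"):
        print(f"{path}:{line_no}: {line.strip()}")
```

This can't tell a read from a write, so every hit still needs a look by hand; it only narrows down where to look.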