paul-tqh-nguyen / arxiv_as_a_newspaper

arxiv.org portrayed as if it were a newspaper.

Do a first pass at adding details to the README #1

Closed paul-tqh-nguyen closed 5 years ago

paul-tqh-nguyen commented 5 years ago

This does not have to be the final draft, but it would be nice if there were more technical details in there, e.g. the purpose of the project, tools used, algorithms used, etc.

This task is extremely limited in scope.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_news_paper/commit/c6df2501e903d3ab22c422c6d0e09aa93da48716 This patch adds an overall summary, a first description of our architecture, and a brief description of our ETL process.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_news_paper/commit/dee57064085c48352f9cff917348cdcc65ee472f This patch extends the changes made in https://github.com/paul-tqh-nguyen/arxiv_as_a_news_paper/commit/c6df2501e903d3ab22c422c6d0e09aa93da48716 by adding links to the external tools referenced in our README.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/bdc94a5c26b8dc3e047a2e8bd5d782389ae57a87 This patch adds stubs for the remainder of the README as we currently see it, including stubs for our main CLI for initiating our ETL process, our front-end server, etc.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/604bf3a35b683d7ab45e29cabb3bec247a949c9a This change to our README adds details about what the top-level commands of our CLI do.

This patch also renames some of the CLI commands so that they better document what they do, e.g. we now have separate commands for running the ETL and writing the results to our DB and for running the ETL and writing the results to a local JSON file.
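To illustrate what self-documenting commands of this kind might look like, here is a minimal sketch using `argparse`. The flag names and help strings are hypothetical and are not taken from the patch; only the idea of separate DB and JSON targets comes from the description above.

```python
import argparse


def build_parser():
    """Build a CLI parser with one explicit flag per ETL output target.

    NOTE: the flag names below are hypothetical illustrations, not the
    project's real commands.
    """
    parser = argparse.ArgumentParser(
        description="Run the ETL process and store the results."
    )
    # Exactly one output target must be chosen per invocation.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument(
        "--run-etl-and-write-to-db",
        action="store_true",
        help="Run the ETL process and write the results to the database.",
    )
    target.add_argument(
        "--run-etl-and-write-to-json",
        metavar="PATH",
        help="Run the ETL process and write the results to a local JSON file.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.run_etl_and_write_to_db:
        print("Would run the ETL and write to the DB.")
    else:
        print(f"Would run the ETL and write to {args.run_etl_and_write_to_json}.")
```

Naming each flag after the full action keeps the `--help` output readable without extra explanation, at the cost of longer command lines.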

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/3785c16b0264f9a9f7e4ba0f55f40b95445e294c

This updates the README to be more consistent with the shallow implementation we have so far, which was added in https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/dfc87e83062b694efb1f82848c372437bf66d680

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/f14fd5e829a8732fd71ba15d43ad587db06515a5

As part of the progress on #2, we want to be able to scrape https://arxiv.org/ using BeautifulSoup.

Installing bs4 with pip3 alone doesn't provide all the desired functionality.

We need lxml installed as well.

This patch updates the README with tips on how to do that.
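As a rough sketch of why both packages matter: BeautifulSoup raises `bs4.FeatureNotFound` when asked for the `"lxml"` parser while lxml is not installed. The helper names `missing_packages` and `parse_headings`, and the sample HTML, are made up for illustration and are not part of the project.

```python
from importlib.util import find_spec


def missing_packages(required=("bs4", "lxml")):
    """Return the names in `required` that are not importable."""
    return [name for name in required if find_spec(name) is None]


def parse_headings(html):
    """Return the text of every <h2> element, parsed with the lxml parser.

    Hypothetical helper: the real scraper's selectors are not shown in
    this thread.
    """
    # Imported lazily so the dependency check can run without bs4 installed.
    from bs4 import BeautifulSoup

    # Requesting "lxml" fails with bs4.FeatureNotFound if lxml is absent.
    soup = BeautifulSoup(html, "lxml")
    return [tag.get_text(strip=True) for tag in soup.find_all("h2")]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print(parse_headings("<html><body><h2>Example paper</h2></body></html>"))
```

Running the dependency check first gives a clearer message than the parser error when only bs4 has been installed.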

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/57ef9892c359991348e08c4ffec3e7d36d9cf7ab

This patch makes minor changes to the --help output of our top-level CLI: it adds details about credentials and fixes punctuation issues.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/543963dec4349ddb7bceafd8da0c44a234c5e64e

The README specifies that the top-level program that interfaces with our utilities is arxiv_as_a_newspaper.py.

We were previously supporting arxiv_as_a_news_paper.py.

This has been fixed.

paul-tqh-nguyen commented 5 years ago

Progress Patch: https://github.com/paul-tqh-nguyen/arxiv_as_a_newspaper/commit/2e0a6020e7e9d48cc1c16dc4d436f0a1f6a9aedc

This is a minor change.

The README has been updated so that more of its terminology links to the relevant Wikipedia articles.

paul-tqh-nguyen commented 5 years ago

I believe our README is now in a state where we can say that the first pass is done.