Open brainwane opened 6 years ago
@lgh2 just chatted with Ernest and got some notes that I'll be turning into a PR. Here are those notes for reference -- they are very rough because I requested very quick notes, so that's my fault, not hers:
Warehouse uses the Pyramid web framework, the SQLAlchemy ORM, and Postgres for its database. Warehouse's front end uses Jinja2 templates.
The application runs in two Docker containers: one contains the static files for the website, and the other contains the database and the Python web application code running in a virtual environment. In the development environment, Docker Compose manages running the containers and the connections between them.
The top-level directory of the Warehouse repo contains a number of files. Among them are the license file, contributing.rst, and readme. The requirements.txt file is for the Warehouse virtual environment. The Dockerfile creates the Docker containers that Warehouse runs in, and the docker-compose yml file configures Docker Compose. Test configuration is in setup.cfg. Heroku uses runtime.txt. The makefile contains commands to spin up Docker Compose and the Docker containers. There are also some files associated with Warehouse's front end.
# add files
Since Warehouse was built on top of a pre-existing database, some of the ORM code is written to fit the existing tables and may not look like the code in SQLAlchemy’s documentation. In some places, joins are done using logic instead of a foreign key.
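To make "joins done using logic instead of a foreign key" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of SQLAlchemy and Postgres so that it runs on its own. The `projects` and `releases` tables and the `releases_for` helper are hypothetical and are not taken from Warehouse's schema.

```python
import sqlite3

# Hypothetical tables: "releases" refers to a project by name only.
# No FOREIGN KEY constraint is declared between the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (name TEXT PRIMARY KEY, summary TEXT);
    CREATE TABLE releases (project_name TEXT, version TEXT);
    INSERT INTO projects VALUES ('example', 'An example project');
    INSERT INTO releases VALUES
        ('example', '1.0'), ('example', '1.1'), ('orphan', '0.1');
    """
)


def releases_for(conn, project_name):
    """Return a project's versions by joining the tables on the name column."""
    rows = conn.execute(
        "SELECT r.version FROM projects AS p "
        "JOIN releases AS r ON r.project_name = p.name "
        "WHERE p.name = ? ORDER BY r.version",
        (project_name,),
    )
    return [version for (version,) in rows]


versions = releases_for(conn, "example")
# 'orphan' has a release row but no project row; nothing in the schema
# prevents that, which is the trade-off of joining on names.
orphans = releases_for(conn, "orphan")
```

In an ORM, the equivalent is usually a relationship whose join condition is written out explicitly, since there is no foreign key from which to infer it.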
Warehouse also uses Pyramid’s hybrid URL traversal and dispatch. Using factory classes, URLs are pre-populated before the view is requested.
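The idea of a factory resolving the URL before the view runs can be shown with a toy model. This is plain Python and not Pyramid's API. The names `ProjectFactory`, `Project`, `Release`, `resolve`, and `release_view`, and the `/project/{name}/{version}/` URL shape, are assumptions made for the sketch.

```python
# Toy model only: it imitates the shape of hybrid dispatch plus traversal
# in plain Python and does not use Pyramid.


class Release:
    def __init__(self, project_name, version):
        self.project_name = project_name
        self.version = version


class Project:
    def __init__(self, name, versions):
        self.name = name
        self._releases = {v: Release(name, v) for v in versions}

    def __getitem__(self, version):
        # Traversal step: a project "contains" its releases.
        return self._releases[version]


# Hypothetical stand-in for a database lookup.
PROJECTS = {"example": Project("example", ["1.0", "1.1"])}


class ProjectFactory:
    """Root object created per request; traversal starts here."""

    def __init__(self, request):
        self.request = request

    def __getitem__(self, name):
        return PROJECTS[name]


def resolve(path):
    """Match the route prefix (dispatch), then walk the rest (traversal)."""
    parts = [segment for segment in path.split("/") if segment]
    if not parts or parts[0] != "project":
        raise LookupError(f"no route matches {path!r}")
    request = {"path": path}
    context = ProjectFactory(request)
    for segment in parts[1:]:
        context = context[segment]  # raises KeyError for unknown names
    return context, request


def release_view(context, request):
    # The view receives an already-resolved object and does no lookups.
    return f"{context.project_name} {context.version}"


context, request = resolve("/project/example/1.0/")
page = release_view(context, request)
```

The point of the design is that lookup failures happen during resolution, before any view code runs, so views can assume their context exists.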
bin/ - high-level scripts for Docker
dev/ - assets for dev env
tests/ - tests
warehouse/ - code in modules
legacy/ - most of the implementation
forklift/ - APIs for upload
accounts/ - user accounts
admin/ - administrator-specific
cache/ - Warehouse - more goes out than goes in - cache as much as possible
classifiers/ - frame classifiers
cli/ - entry scripts
i18n/ - internationalization
locales/ - internationalization
manage/ - DB
migrations/ - DB
packaging/ - models
- rate limiting to prevent abuse
- RSS feeds
- site maps
utils/
sql - some code not in docs because relations already existed
some use logic and not foreign key and are joined on names (this may change)
factory methods prepopulate url before view requested
Pyramid hybrid URL Traversal and Dispatch:
https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/hybrid.html
Pyramid: https://docs.pylonsproject.org/projects/pyramid/en/latest/index.html
SQLAlchemy: https://docs.sqlalchemy.org/en/latest/
Postgres: https://www.postgresql.org/docs/
Docker: https://docs.docker.com/
Docker Compose: https://docs.docker.com/compose/overview/
It would be great if this documentation also explained what files/directories/libraries Warehouse uses to produce its various APIs.
@brainwane Could you outline what was missing from https://github.com/pypa/warehouse/pull/2937 that would fully resolve this issue?
Thanks for asking @di. I'd like the Warehouse developer documentation to include:
pypi-theme
In today's Warehouse developers' meeting we decided to pare down our near-future milestones on our development roadmap so they really only contain the essential bugfixes and features we need to launch, replace legacy PyPI, and shut down the old site. So I'm moving this issue into a milestone further in the future.
While talking with @brainwane on IRC, I came up with two ideas:
I think a Glossary could help clear up confusion between similar concepts and synonyms found in both the codebase and the docs, e.g. project, distribution, package, version, author, maintainer, etc.
Also, I think it would be valuable to include architecture beyond the codebase: things like design preferences for tests, how the Docker containers are set up right now, detailed descriptions of what each make command does, and other "development" parts of the workflow for completeness. Adding things besides the code layout that are also part of the system. :)
We should be careful (I almost made the mistake myself) with mixing contribution guidelines with the system architecture, design choices and codebase information.
I'm reconsidering the directory layout section that specifies what each subdirectory concerns itself with, as that is almost guaranteed to change over time and become out of date.
The Glossary might provide enough context on what the module names mean, and a basic primer on Pyramid app/module layout would probably suffice.
Just my unqualified 2 cents as a first-time user of warehouse: For me, the directory structure and the "assumptions and concepts" block were the most helpful parts of the documentation once I was set up and trying to get my bearings, because it was helpful in figuring out where to start exploring.
Let's update our application structure overview with a writeup like Zulip's architecture summary or a curated list of links to conference talks, blog posts, etc. that would get us 30% of the way towards a history and application overview like this MediaWiki overview. We'd mention frameworks and components we use, like:
and engineering approaches we recommend people know about as they learn Warehouse.
Reasoning: Developers who are new to a codebase need to know the design rationale of the confusing bits: why it was made this way, what decisions are embedded in particular choices, whether particular components are the result of a feature request, a quick fix after an outage, an experiment, etc. (This is based on research summarized in Making Software.)
Discussed a bit on the pypa-dev mailing list.