citybound / modstore

A webapp to discover, rate and publish mods for Citybound

Datastore requirements for mod data and metadata #3

Open chances opened 7 years ago

chances commented 7 years ago

Modstore Data Requirements

Data Objects / Sources

(Copied from Data Objects / Sources in the README)

• A Mod Archive
    • Manifest
    • README.md
    • CHANGELOG.md
• Database Records
    • Mod
    • Mod Rating
• Cross-platform Compilation Service
    • Cross-platform binary downloadables

Database Solutions

Both Mongo and Postgres could meet the storage needs above, but Postgres offers stronger consistency guarantees. If high availability is the main concern, Mongo could still be a good choice despite its potential for inconsistency.

Mongo Considerations

Mongo is still "eventually consistent."

This jumps out as a disadvantage to choosing Mongo: "Reads from secondaries can be useful in scenarios where it is acceptable for data to be slightly out of date."

It seems necessary that records for mod dependencies never be out of date, as stale records could complicate mod downloading and installing. If a request for a mod's dependencies yields out-of-date results, server performance could suffer, because the server would have to retry one or more times to get the required version of a dependency.

Postgres Considerations

Is Postgres NoSQL Better Than MongoDB?

TBD

Questions

Mod Archive Storage

n1313 commented 7 years ago

Here're my two cents:

  1. The modstore's primary concerns should be:

    • to allow authors to submit their work, publish, unpublish, update and delete it
    • to allow gamers to install the mods in an easy and safe way without leaving the game client
    • to allow gamers to browse the collection of mods, search for mods, discover new mods, and rate them using a standard web browser
    • to allow moderators to curate contents of the modstore: promote mods, create lists of recommendations and respond to abuse or violations
  2. The modstore should have a basic user management system with standard features like registration, login, logout, change of password and so on. Submissions should be only accepted from an authenticated user.

  3. The modstore should expose an API for browsing through the mod collection: search by name and keywords, sorting by popularity or date, checking for updates and so on. The game client should limit itself to only basic operations with it (search, update, install?) and not attempt to display any potentially dangerous media such as screenshots, HTML markup and so on.

  4. The mod should be exported to the game as a single archive file of predefined format (ZIP?) with predefined content structure and a single manifest file of predefined format (JSON?) with mandatory and optional fields. So, for example, http://cityboundsim.com/mods/${mod_id}/manifest.json with something like

    {
      id: 'mybestmod', // an alphanumeric string unique to this modstore
      name: 'My best mod',
      author: 'John Doe',
      email: 'john@doe.com',
      version: '2.3.2', // semver
      updated: '2017-01-25T13:00:00Z', // ISO8601
      description: 'Description of the mod\nNo fancy markup, just good old plain text',
      url: 'http://cityboundsim.com/mods/mybestmod',
      package: 'http://cityboundsim.com/mods/mybestmod/package.zip',
      dependencies: {
        'citybound': '1.0.x', // a range of versions
        'another-mod': '0.2.4', // exact version number
      },
      meta: { // a collection of things that are not required for installation
        homepage: 'http://github.com/johndoe/mybestmod',
        screenshots: [], // should not be displayed in-game because of possible security complications
        rating: 4.5,
        downloads: 10000,
        tags: [
          'mod', // as opposed to, say, 'total-conversion', 'model' or 'scenario'
          'cats',
          'dogs'
        ],
        license: 'MIT'
      }
    }
  5. The modstore should accept submissions in the form of archive files. The author must fill out the submission form, and the mod manifest file will be generated based on form data. The decision on where to host the archives should be made after we know more about how big the mods are on average and what kind of traffic we are talking about.

    Another option would be to allow submission of manifest files and let authors host their mod files themselves, but this opens up a bunch of potential problems with accessing those external files. Yet another option would be to force integration with something like GitHub and make authors submit their repo URLs, so that the modstore would pull the sources and build the mod, but that is way too complicated, in my opinion (and it also prohibits closed-source mods).

  6. The modstore should only do basic validation of archives (for file corruption?). No compatibility or security checks are to be made, since this is a responsibility of the game client itself.

  7. The decision on what data store engine to choose should be made based primarily on practical concerns such as cost of operations and familiarity to the developers. It is highly unlikely that the modstore will see any significant (in real world numbers) amount of traffic any time soon, so even something as dumb as writing JSONs manually to the disk and then manually iterating through them for search will do, as long as it can be easily understood and maintained by an average dev.

  8. The modstore should keep old versions of a mod alongside the most recent one for as long as a published mod depends on that particular version, or until the mod author decides to delete it.
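The retention rule in point 8 could be sketched roughly as follows. This is only an illustration under stated assumptions: `PublishedVersion`, `versions_to_keep`, the `latest` mapping and the `deleted_by_author` set are hypothetical names, and only exact-version dependencies (like `'another-mod': '0.2.4'`) are considered, not version ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedVersion:
    """One published version of a mod (hypothetical record shape)."""
    mod_id: str
    version: str
    # Exact-version dependencies only, as (mod_id, version) pairs.
    dependencies: tuple = ()


def versions_to_keep(published, latest, deleted_by_author=frozenset()):
    """Return the set of (mod_id, version) pairs that should stay stored.

    A version is kept if it is the latest version of its mod, or if some
    other kept version depends on it. Anything the author deleted is
    dropped regardless.
    """
    alive = {
        (p.mod_id, p.version): p
        for p in published
        if (p.mod_id, p.version) not in deleted_by_author
    }
    # Repeat until stable, so that dropping an old version also releases
    # the versions that only it depended on.
    while True:
        depended_on = {dep for p in alive.values() for dep in p.dependencies}
        kept = {
            key: p
            for key, p in alive.items()
            if latest.get(p.mod_id) == p.version or key in depended_on
        }
        if len(kept) == len(alive):
            return set(kept)
        alive = kept
```

The loop matters only when old versions depend on other old versions. With a flat dependency graph a single pass gives the same answer.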

chances commented 7 years ago

@n1313

> The decision on where to host the archives should be made after we know more about how big the mods are on average and what kind of traffic we are talking about.

I think this would be a mistake. A decision will have to be made upfront before we have any real metrics about the average size of mods and average server load. There needs to be infrastructure in place initially for the pioneers that upload real mods, even in some alpha period. If nothing else, perhaps we should prioritize an initial storage solution that is pluggable? That way if it's determined that the first choice was a bad one, a better option can be swapped in without much fuss.
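To make "pluggable" a bit more concrete, here is a minimal sketch of what such a storage seam might look like. Everything in it is an assumption for illustration: the `ArchiveStorage` interface, the two backends, the `publish` helper and the URL layout are hypothetical, and `mod_id` and `version` are assumed to be validated before they are used in a path.

```python
from pathlib import Path
from typing import Protocol


class ArchiveStorage(Protocol):
    """Anything that can store an archive and hand back a direct link."""

    def save(self, mod_id: str, version: str, data: bytes) -> str:
        ...


class LocalDiskStorage:
    """Writes archives under a directory served by the web server."""

    def __init__(self, root: Path, base_url: str) -> None:
        self.root = root
        self.base_url = base_url.rstrip("/")

    def save(self, mod_id: str, version: str, data: bytes) -> str:
        # mod_id and version are assumed to be validated already.
        target = self.root / mod_id / version / "package.zip"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return f"{self.base_url}/{mod_id}/{version}/package.zip"


class InMemoryStorage:
    """Stand-in backend, e.g. for tests or a later swap."""

    def __init__(self) -> None:
        self.blobs = {}

    def save(self, mod_id: str, version: str, data: bytes) -> str:
        self.blobs[(mod_id, version)] = data
        return f"memory://{mod_id}/{version}/package.zip"


def publish(storage: ArchiveStorage, mod_id: str, version: str, data: bytes) -> dict:
    """Store the archive and return the manifest fields that point at it."""
    return {
        "id": mod_id,
        "version": version,
        "package": storage.save(mod_id, version, data),
    }
```

The rest of the modstore would only ever call `publish` with some backend, so replacing the first choice later means writing one new class rather than touching the callers.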

> Yet another option would be to force integration with something like github ... but that is way too complicated ...

This issue of integration was brought up in the conversation on Gitter. They're right, integrations should be secondary to the primary goals of the modstore. The integrations should enhance an already good foundation that the modstore provides.

> The decision on what data store engine to choose should be made based primarily on practical concerns such as cost of operations and familiarity to the developers. It is highly unlikely that the modstore will see any significant (in real world numbers) amount of traffic any time soon ...

I agree. I love over-engineering things as much as the next developer, but making a reasonable choice here is key. We should prioritize simplicity and a developer experience that is as frictionless as possible.

Siding with a tried and tested datastore solution would be a good bet, I think.
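For a sense of scale, the simplest option mentioned earlier in the thread (JSON manifests on disk, scanned linearly for search) might look something like the sketch below. It is not a recommendation over a real datastore. The directory layout (`<root>/<mod_id>/manifest.json`) and the function names are assumptions, and the field names follow the example manifest above.

```python
import json
from pathlib import Path


def load_manifests(root: Path) -> list:
    """Read every <root>/<mod_id>/manifest.json into a list of dicts."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(root.glob("*/manifest.json"))
    ]


def search(manifests, query="", sort_by="downloads"):
    """Case-insensitive substring search over mod names and tags.

    sort_by is "downloads" (most downloaded first) or "updated"
    (most recently updated first).
    """
    needle = query.lower()

    def matches(manifest):
        haystack = [manifest.get("name", "")]
        haystack += manifest.get("meta", {}).get("tags", [])
        return any(needle in text.lower() for text in haystack)

    def sort_key(manifest):
        if sort_by == "updated":
            # ISO 8601 UTC timestamps in one format sort correctly as strings.
            return manifest.get("updated", "")
        return manifest.get("meta", {}).get("downloads", 0)

    hits = [m for m in manifests if matches(m)]
    return sorted(hits, key=sort_key, reverse=True)
```

A full scan like this is linear in the number of mods, which should be fine for the traffic expected early on and easy to replace once it is not.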

n1313 commented 7 years ago

> A decision will have to be made upfront before we have any real metrics about the average size of mods and average server load.

Let me rephrase my comment. I agree that the initial decision will indeed have to be made before the first mod is created, but it should be made based on the same practical concerns as the choice of DB. The initial load and storage requirements will be minimal, and we should think of moving to a "proper" hosting solution when we have a better idea of the average mod size and so on.

> an initial storage solution that is pluggable

No matter what is chosen, the end result will be a direct link to the package file. If this link is a part of the manifest file, then it will be possible to easily change the storage provider: we'll only need to update the links in manifests after migrating.
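As a rough sketch of that migration step, assuming manifests are available as plain dicts with a `package` field as in the example above (the function name and the new base URL are made up for illustration):

```python
def rewrite_package_links(manifests, old_base, new_base):
    """Return copies of the manifests with package links moved to new_base.

    Manifests whose package link does not start with old_base are
    returned unchanged.
    """
    updated = []
    for manifest in manifests:
        package = manifest.get("package", "")
        if package.startswith(old_base):
            # Keep the path after the base, swap only the prefix.
            manifest = {**manifest, "package": new_base + package[len(old_base):]}
        updated.append(manifest)
    return updated
```

Game clients that always read the link from the manifest would pick up the new location without any change on their side.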

Overall, for both storage and DB, my suggestion would be to go with whatever there is on Anselm's webserver that hosts the website today. If it is MySQL, PHP and move_uploaded_file(), then move_uploaded_file() it is.

n1313 commented 7 years ago

According to Anselm,