dwyl / github-backup

:octocat: :back: 🆙 Backup your GitHub Issues so you can still work when (they/you are) offline.
https://github-backup.herokuapp.com
GNU General Public License v2.0

Add alog to application #127

Open RobStallion opened 5 years ago

RobStallion commented 5 years ago

I am thinking of adding alog to this project.

When this project was created, alog did not exist. The project already behaves like alog in one respect: it never updates a comment on an issue. Instead, it adds a new version of the comment to the versions table in the database.

However, if an issue name is ever updated, github-backup just updates the existing record in the database in place.

With alog we can easily make each table of this application (that needs to be append-only) append-only.

RobStallion commented 5 years ago

With alog in the project we would be able to remove the versions table and replace it by adding :entry_id to the comments table.

Pros:

We can also add :entry_id to any other tables to make them append-only.

Cons:

May be tricky to transfer any existing data from the versions table into the comments table (not sure if this is needed however as I do not think that github-backup is being used on any project at the moment)

@nelsonic do you have any thoughts on the above?

nelsonic commented 5 years ago

@RobStallion This is a good question. GitHub issue data is a good use-case for an append-only log. 👍 However, the first step is to define the schema for the data, so that we have a table for storing the data that was previously stored in S3 #126 ... How do you feel about doing the "mapping" between what is returned by the GitHub API and a schema for the data?

RobStallion commented 5 years ago

@nelsonic That sounds like a good plan of action to me

RobStallion commented 5 years ago

Current version

Stores the comment text on S3 in a JSON object.

A new JSON object is created for each issue. The name of the JSON file is the GitHub issue id.

The keys of the object come from the postgres versions table ID field (the key is a version_id).
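As a rough illustration of the layout described above, the per-issue object could look something like the following once decoded. The ids, the comment text and the `.json` extension are made-up assumptions; only the overall shape (one object per issue, keyed by version_id) comes from the description.

```elixir
# Hypothetical values, for illustration only.
issue_id = "12345"

# Assumption: the file is named after the issue id, with a .json extension.
file_name = issue_id <> ".json"

# Keys are version_ids from the postgres versions table,
# values are the comment text stored for that version.
versions = %{
  "1" => "First version of a comment",
  "2" => "First version of a comment, after an edit"
}

IO.inspect({file_name, versions})
```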

Remove S3 - store comments in postgres

If we remove S3 then the comments table would need to store the comment.body.

Because we need to keep a record of every update to a comment on GitHub (so we can see previous versions of a comment), we cannot update our existing comment in postgres. Instead, we can insert another row into the comments table. As every comment comes with its own unique ID from GitHub (comment_id), we can easily pull out all the versions of a comment from our comments table based on the comment_id. Club soda, which uses an append-only log, does something similar, but with a field called entry_id. See here for an example schema.

This will also allow us to remove the versions table from our postgres database, as its two main reasons for being created will no longer apply. The versions table also currently stores the author of a comment (a link to the users table). This could just be moved across into the updated comments table.
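The append-only idea can be sketched without a database, using a plain list in place of the comments table. This is only an illustration under assumptions: the module name, the function names and the integer `inserted_at` counter are hypothetical, and none of this is alog's or github-backup's actual code.

```elixir
defmodule CommentLogSketch do
  # Append a new version of a comment. Existing rows are never modified.
  def append(rows, comment_id, body, author) do
    row = %{
      comment_id: comment_id,
      comment: body,
      author: author,
      deleted: false,
      # Stand-in for the inserted_at value that timestamps() would set.
      inserted_at: length(rows) + 1
    }

    rows ++ [row]
  end

  # Every stored version of one comment, oldest first.
  def versions(rows, comment_id) do
    rows
    |> Enum.filter(&(&1.comment_id == comment_id))
    |> Enum.sort_by(& &1.inserted_at)
  end

  # The current text is simply the most recent version (nil if none exist).
  def latest(rows, comment_id) do
    rows |> versions(comment_id) |> List.last()
  end
end

rows =
  []
  |> CommentLogSketch.append("c1", "Frist draft", "alice")
  |> CommentLogSketch.append("c2", "Another comment", "bob")
  |> CommentLogSketch.append("c1", "First draft", "alice")

IO.inspect(CommentLogSketch.versions(rows, "c1"))
IO.inspect(CommentLogSketch.latest(rows, "c1"))
```

An edit to comment "c1" adds a third row rather than changing the first one, so the earlier text stays available through `versions/2`.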

RobStallion commented 5 years ago

Comments Schema

Current


  schema "comments" do
    field :comment_id, :string
    field :deleted, :boolean
    belongs_to :issue, Issue
    belongs_to :user, User, foreign_key: :deleted_by
    has_many :versions, Version

    timestamps()
  end

Possible update


  schema "comments" do
    field :comment, :string # could be called comment_text / comment_body
    field :comment_id, :string
    field :deleted, :boolean
    belongs_to :issue, Issue
    belongs_to :user, User, foreign_key: :deleted_by
    # association names must be unique within a schema, so the author needs its own name
    belongs_to :author_user, User, foreign_key: :author

    timestamps()
  end

As you can see, not much needs to change to make an append-only log work with this data.
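To tie this back to the "mapping" question raised earlier, here is a minimal sketch of turning one decoded issue-comment payload into attributes for the proposed table. The module and function names are hypothetical. The payload keys used ("id", "body", "user" with "login") are a small subset of what the GitHub API returns, and storing the login as the author is a simplification: in the schema above the author is a reference to the users table.

```elixir
defmodule PayloadMappingSketch do
  # Hypothetical helper: map a decoded comment payload to comment attributes.
  def to_comment_attrs(payload) do
    %{
      # comment_id is a :string field in the schema, so the numeric id is converted.
      comment_id: to_string(payload["id"]),
      comment: payload["body"],
      # Simplification: a real implementation would look up the users table.
      author: get_in(payload, ["user", "login"]),
      deleted: false
    }
  end
end

# Made-up payload for illustration.
payload = %{
  "id" => 1001,
  "body" => "Looks good to me",
  "user" => %{"login" => "example-user"}
}

attrs = PayloadMappingSketch.to_comment_attrs(payload)
IO.inspect(attrs)
```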

RobStallion commented 5 years ago

I do have one small issue. I believe that alog currently requires an :entry_id field in order to work. Our :entry_id would be the :comment_id. This is not a breaking change, but it would probably be clearer to developers if we could label the field :comment_id.

This may be something we can open an issue on alog for. I want to look into it first to double-check that this is the case.

@nelsonic do you have any thoughts/comments on the above?

RobStallion commented 5 years ago

Response to self. alog does make you use an :entry_id. See below

https://github.com/dwyl/alog/blob/master/lib/alog.ex#L34-L39