dwyl / github-backup

:octocat: :back: 🆙 Backup your GitHub Issues so you can still work when (they/you are) offline.
https://github-backup.herokuapp.com
GNU General Public License v2.0
32 stars 3 forks

Backup all issues/comments on install #65

Closed SimonLab closed 6 years ago

SimonLab commented 6 years ago

Linked to #24 (get all the issues)

As a user, I want all the issues and comments of my repository to be saved when the GitHub app is installed, so that I can access a backup of all the issues since the creation of my repository.

This feature will send a lot of requests to the GitHub API and to S3, so we might want to see whether we can group these requests. As this is a background process, though, the speed of these actions is not a blocker.

SimonLab commented 6 years ago

The payload sent by GitHub has changed a bit: instead of a repository key, an array of repositories is returned (just one when the app is installed on a single repository): image

Installing the app on all repositories: image

The X-GitHub-Event header value also differs depending on whether the app is installed on only one repository or on all repositories:

also: image https://developer.github.com/changes/2017-05-22-github-apps-production-ship/

SimonLab commented 6 years ago

For each repository where the app has been installed, we build a Map:

          %{
            repository: r,
            issues: issues,
            comments: comments
          }

where each comment has the following format: image

We now need to create a schema for each issue and link the data together. We can link comments and issues together based on the issue_url value.
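A minimal sketch of linking comments to issues by issue_url, assuming the decoded payloads are maps with string keys and that each issue exposes its API address under "url" (as in the GitHub REST API). The module name LinkSketch and the "linked_comments" key are hypothetical, chosen here only for illustration:

```elixir
defmodule LinkSketch do
  # Group the comments by the issue they belong to, then attach each group
  # to the issue whose "url" matches the comments' "issue_url".
  def link_comments(issues, comments) do
    comments_by_issue = Enum.group_by(comments, fn comment -> comment["issue_url"] end)

    Enum.map(issues, fn issue ->
      # Issues without any comment get an empty list.
      Map.put(issue, "linked_comments", Map.get(comments_by_issue, issue["url"], []))
    end)
  end
end

# Made-up sample data; the URLs are placeholders.
issues = [
  %{"url" => "https://api.github.com/repos/owner/repo/issues/1", "title" => "First"},
  %{"url" => "https://api.github.com/repos/owner/repo/issues/2", "title" => "Second"}
]

comments = [
  %{"issue_url" => "https://api.github.com/repos/owner/repo/issues/1", "body" => "Hello"},
  %{"issue_url" => "https://api.github.com/repos/owner/repo/issues/1", "body" => "World"}
]

linked = LinkSketch.link_comments(issues, comments)
```

Grouping once with Enum.group_by avoids scanning the whole comment list for every issue.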

SimonLab commented 6 years ago

The deprecated header X-GitHub-Event: integration_installation is still sent on top of the event X-GitHub-Event: installation. So we still need to add a case for integration_installation, which will be discarded, but this will allow us to avoid any errors on our server: image
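One way to sketch this is with pattern-matched function clauses on the event name. The module and function names (EventSketch, handle_event) and the return values are hypothetical; only the two header values and the repositories key come from the discussion above:

```elixir
defmodule EventSketch do
  # Current event: the payload carries the list of repositories
  # the app was installed on.
  def handle_event("installation", %{"repositories" => repositories}) do
    {:ok, repositories}
  end

  # Deprecated duplicate of the event above: accept it and do nothing,
  # so the server does not raise on an unmatched event.
  def handle_event("integration_installation", _payload), do: :ignored

  # Any other event is outside the scope of this sketch.
  def handle_event(_event, _payload), do: :unsupported
end
```

A controller could then read the X-GitHub-Event header, pass its value as the first argument, and only start the backup on an {:ok, repositories} result.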

SimonLab commented 6 years ago

Using the Repo.insert_all function (https://hexdocs.pm/ecto/Ecto.Repo.html#c:insert_all/3) on all the issues might not be the right solution: image

It seems that building the issues with their associated comments might not work:

%{issue_id: 1, comments: [...]}

From the doc: "It is also not possible to use insert_all to insert across multiple tables, therefore associations are not supported."

SimonLab commented 6 years ago

Also from the documentation of the insert_all function: "However any other autogenerated value, like timestamps, won't be autogenerated when using insert_all/3. This is by design as this function aims to be a more direct way to insert data into the database without the conveniences of insert/2"

Manually inserting the timestamp might not be an issue in our case, as we want the timestamp to be the time the issue was created on GitHub, not the time the issue is inserted into our database.
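A rough sketch of preparing rows with manual timestamps for insert_all/3. It assumes the issues are decoded maps with string keys and ISO 8601 "created_at" / "updated_at" values (as the GitHub API returns them). The column names issue_id, title, inserted_at and updated_at, and the module name TimestampSketch, are assumptions and not the project's actual schema:

```elixir
defmodule TimestampSketch do
  # Turn one decoded GitHub issue into a plain map of column values.
  def to_row(issue) do
    %{
      issue_id: issue["id"],
      title: issue["title"],
      # Reuse GitHub's own dates instead of the time of insertion.
      inserted_at: parse(issue["created_at"]),
      updated_at: parse(issue["updated_at"])
    }
  end

  # The trailing "Z" offset is discarded by NaiveDateTime.from_iso8601!/1.
  defp parse(iso8601) do
    iso8601
    |> NaiveDateTime.from_iso8601!()
    |> NaiveDateTime.truncate(:second)
  end
end

# Made-up sample issue.
issues = [
  %{
    "id" => 1,
    "title" => "First",
    "created_at" => "2018-02-01T10:00:00Z",
    "updated_at" => "2018-02-02T11:30:00Z"
  }
]

rows = Enum.map(issues, &TimestampSketch.to_row/1)

# With a configured Ecto repo and an Issue schema (both hypothetical here):
# Repo.insert_all(Issue, rows)
```

The rows are built without touching the database, so this step can be checked on its own before the insert_all call is wired in.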

SimonLab commented 6 years ago

To insert all the issues with their associations (comments and versions), we might need to break the work down into steps similar to:

post = Ecto.Changeset.change(%Post{}, title: "Hello", body: "world")
comment = Ecto.Changeset.change(%Comment{}, body: "Excellent!")
post_with_comments = Ecto.Changeset.put_assoc(post, :comments, [comment])
Repo.insert!(post_with_comments)

Or we could maybe use a transaction:

Repo.transaction fn ->
  post = Repo.insert!(%Post{title: "Hello", body: "world"})
  # Build a comment from the post struct
  comment = Ecto.build_assoc(post, :comments, body: "Excellent!")
  Repo.insert!(comment)
end

ref: http://blog.plataformatec.com.br/2015/08/working-with-ecto-associations-and-embeds/ ("Manipulating associations" section)

nelsonic commented 6 years ago

@SimonLab looking good. 🎉