dwyl / github-backup

:octocat: :back: 🆙 Backup your GitHub Issues so you can still work when (they/you are) offline.
https://github-backup.herokuapp.com
GNU General Public License v2.0

Saving Comment Richtext Data to S3 #31

Closed Cleop closed 6 years ago

Cleop commented 6 years ago

Relates to #18

We are going to use S3 to save the text of comments.

Implementation suggestion from @nelsonic:

Our Phoenix App will create a {commentid}.md file in the S3 bucket using ex-aws and store a reference to it in the comment_history table.

Implementation suggestion from @SimonLab:

Create a function to save a json into s3, see: https://medium.com/founders-coders/image-uploads-with-aws-s3-elixir-phoenix-ex-aws-step-1-f6ed1c918f14 using a specific dependency to upload the image, we can check if this package is still up to date and create a function which uses it.

Both use ex-aws, but one is suggesting saving as .md files, the other as json files.

Cleop commented 6 years ago

Notes

ex-aws says: 'Elixir streams to automatically retrieve paginated resources.'

ExAws.S3.list_objects("my-bucket") |> ExAws.request
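The pagination streams mentioned in the ex-aws README can be consumed like this (a hedged sketch; `"my-bucket"` is a placeholder and the request still needs valid AWS credentials in config):

```elixir
# Lazily stream every object in the bucket; ExAws.stream!/1 follows
# pagination automatically, issuing further list_objects requests
# as the stream is consumed.
"my-bucket"
|> ExAws.S3.list_objects()
|> ExAws.stream!()
|> Stream.map(& &1.key)
|> Enum.take(10)
|> IO.inspect()
```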

Getting an error when trying to make the request (screenshot):

I started following the image upload tutorial but leaving out the saving to the DB parts as I didn't want to open that area in this issue. I've got several questions etc which I've made comments for in my code and pushed so I can go through these with @SimonLab.

Cleop commented 6 years ago

This morning I worked with @SimonLab to try to debug my code from yesterday. We decided to proceed with my initial code from the ex-aws repo before looking at JC's code. Despite the repo implying that the config is done automatically ("ExAws by default does the equivalent of..."), we manually added the config and the server now runs.

We then manually uploaded something to S3, because the command provided in the ex-aws README:

With these deps you can use ExAws precisely as you're used to: # make a request (with the default region) ExAws.S3.list_objects("my-bucket") |> ExAws.request

sounds as though it lists rather than uploads resources, so we needed something in our bucket to list.

We then logged the result of this command and got this (screenshot):

We checked that the env vars have been set and they have. So next I'm going to look into why this may be occurring.

(screenshot)

Couldn't find anything helpful or relevant on this train of thought, so I'm now working through the tutorial approach on a separate branch.

I've done the whole tutorial from scratch and have not changed the names of anything to adapt it to our needs. I came across a number of places where the tutorial had errors (mostly missing ends or brackets), but now I'm getting these two errors (screenshot). I tried to remove the html function to see if it would work otherwise, and that's when I got the second error.

The html fn error was fixed by adding a comma. @SimonLab suggested the second one could be fixed by changing the word model to schema to account for the new Phoenix version, but this has not worked. I'm parking this until @SimonLab and I can pair on it.

SimonLab commented 6 years ago

(screenshot) I'll need to manually update Elixir to be able to use AWS with Elixir.

try asdf to manage elixir versions: https://gist.github.com/rubencaro/6a28138a40e629b06470

iteles commented 6 years ago

This is probably where learnings from this issue should go 😊 https://github.com/dwyl/learn-amazon-web-services

Cleop commented 6 years ago

@nelsonic @SimonLab - once we get this up and running, do we want to save our files as .md or json files?

nelsonic commented 6 years ago

@Cleop good question, I've been thinking about this ... We can make the issue comment history "lookup" both faster and cheaper if we use .json. Specifically, the file should be {issue_id}.json and we should use the format:

{
  "comment_revision_id": "escaped rich text",
  "comment_revision_id": "escaped rich text, etc ..."
}

This way we only need to retrieve a single file from S3 and display it on a page. #23
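The format above can be sketched in Elixir (hedged: the `CommentHistory` module, the `revisions` shape, and the bucket name are illustrative, and Poison is assumed as the JSON library):

```elixir
# Build the {comment_revision_id => escaped rich text} map for one
# issue and encode it as the JSON body to store at {issue_id}.json.
defmodule CommentHistory do
  # `revisions` is assumed to be a list of {revision_id, text} tuples.
  def to_json_body(revisions) do
    revisions
    |> Map.new()
    |> Poison.encode!()
  end
end

# Usage sketch: one put_object per issue, so a page view needs
# only a single S3 retrieval.
body = CommentHistory.to_json_body([{"rev_1", "first draft"}, {"rev_2", "edited text"}])
ExAws.S3.put_object("my-bucket", "123.json", body) |> ExAws.request()
```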

SimonLab commented 6 years ago

After creating new credentials for my AWS account I can access the bucket information (screenshot).

This works without the bucket name environment variable, which makes me think the information might just be the public part (need to check that). However, with the bucket name in the .env I still have the same error (screenshot).

nelsonic commented 6 years ago

@SimonLab so you are getting a 404 error when attempting to view the file you uploaded?

SimonLab commented 6 years ago

@nelsonic the 404 error is a conflict in the configuration of ex-aws. I haven't tried yet to download the file (the URL is not displayed when I manage to get some basic information). We need to make sure first that the configuration of ex-aws is done properly and understand where to use the environment variable (in the config or directly in the ex-aws function calls).

nelsonic commented 6 years ago

@SimonLab which branch are you working on? (I only see this issue referenced in Cleo's commits...) I can try and take a look if you need a "second pair of eyes"...

SimonLab commented 6 years ago

I'm working on https://github.com/dwyl/github-backup/tree/save-to-s3. At the moment I'm having a closer look at the ex-aws code to try to understand how the requests are built. I'm using this config: https://github.com/dwyl/github-backup/blob/5dc3126b867b88cff6c2fdbd2d5a4bdbe1ca4020/config/config.exs#L32-L39

Then in get_files_buckets: https://github.com/dwyl/github-backup/blob/5dc3126b867b88cff6c2fdbd2d5a4bdbe1ca4020/lib/app_web/controllers/aws/s3.ex#L17 if you pass nil as a parameter to the function you should be able to get some information logged in the terminal (we are calling the function on the home page at the moment: https://github.com/dwyl/github-backup/blob/5dc3126b867b88cff6c2fdbd2d5a4bdbe1ca4020/lib/app_web/controllers/page_controller.ex#L7). But the ex-aws documentation is not clear: it suggests passing the name of the bucket to the function, but this returns a 404 error.

I'm wondering if ex-aws is necessary; maybe we could simplify the requests to AWS by just using HTTPoison, for example.

documentation: https://hexdocs.pm/ex_aws_s3/ExAws.S3.html

SimonLab commented 6 years ago

I still can't find out why s3.list_objects with the correct bucket name as a parameter returns a 404 NoSuchKey. I'm starting to think it might be an issue with the way the bucket was created on S3. @nelsonic I'd like to create another bucket to see if I still have the same error; is that ok, or do we need to pay more for this?

Otherwise, listing all the items in the bucket is not really a function we will use, so I might try to upload a file to the bucket first and get the URL of this file, then try to read the file with our current credentials.

nelsonic commented 6 years ago

@SimonLab you can create unlimited buckets without incurring any cost (provided you don't store large files in them...)

SimonLab commented 6 years ago

I have a similar error with a newly created bucket (screenshot), which makes me think the error is due to our code rather than AWS itself.

SimonLab commented 6 years ago

By defining the region directly in the S3 function call, I managed to get the list of items and to create a new item in a bucket:

S3.put_object("sauvegardedata","test.json", "{\"yo\": \"yeaaa\"}") |> ExAws.request(region: "eu-west-2") |> IO.inspect

The configuration contains only the basic credentials:

config :ex_aws,
  access_key_id: System.get_env("AWS_ACCESS_KEY_ID"),
  secret_access_key: System.get_env("AWS_SECRET_ACCESS_KEY")
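With only credentials in config, the region can be passed per request. Reading the test object back works the same way (a hedged sketch using the bucket and key from the put_object call above; still requires live AWS credentials):

```elixir
# Fetch the test object back and decode its JSON body.
# The per-request region override mirrors the put_object call.
{:ok, %{body: body}} =
  ExAws.S3.get_object("sauvegardedata", "test.json")
  |> ExAws.request(region: "eu-west-2")

Poison.decode!(body) |> IO.inspect()
```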

Cleop commented 6 years ago

Found this article with some nice documentation: https://alexgaribay.com/2017/01/20/upload-files-to-s3-with-phoenix-and-ex_aws-2/

@SimonLab - do you think this discussion around public/private permissions could be related to the list fn 404 error? https://github.com/ex-aws/ex_aws/issues/364 Although you'd think it would be a 403 error if that were the case.

Cleop commented 6 years ago

@SimonLab - we have got this working on the save-to-s3 branch. Are we in a position to create a PR for this or is there anything blocking us from doing this?

Cleop commented 6 years ago

@SimonLab - would you like to close this issue as you created it and it's a technical one?

SimonLab commented 6 years ago

The text of the comments is now saved in S3 in the following stringified JSON format: {version_id: "text comment", ...}
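The final flow can be sketched end to end (hedged: the module name, env var name, and region are illustrative assumptions, not the repo's actual code):

```elixir
# Encode a comment's revision history as {version_id => text} JSON
# and store it in S3 under the comment's id, per the agreed format.
defmodule App.CommentBackup do
  def save(comment_id, versions) when is_map(versions) do
    bucket = System.get_env("S3_BUCKET_NAME")  # assumed env var name
    body = Poison.encode!(versions)

    ExAws.S3.put_object(bucket, "#{comment_id}.json", body)
    |> ExAws.request(region: "eu-west-2")
  end
end

# Usage sketch:
# App.CommentBackup.save(42, %{"v1" => "first draft", "v2" => "edited text"})
```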