amazon-archives / logstash-input-dynamodb

This input plugin for Logstash scans a specified DynamoDB table and then reads changes to that table from its associated DynamoDB Stream. This gem is a Logstash plugin that must be installed on top of the Logstash core pipeline; it is not a stand-alone program.
Apache License 2.0

DynamoDB-Local support #10

Open pewallin opened 8 years ago

pewallin commented 8 years ago

It seems this plugin does not work with dynamodb-local? The region is parsed from the endpoint, which of course will not work with http://localhost:8000. I'm having trouble compiling the plugin; otherwise it looks like a fairly simple fix on line 174 of dynamodb.rb. Maybe allow a config option such as "dynamodb-local_region"?
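For illustration, here is a hedged Ruby sketch of why deriving the region from the endpoint fails for a local endpoint: an AWS endpoint hostname embeds the region, but localhost does not. The helper name and regex are hypothetical, not the plugin's actual code.

```ruby
require 'uri'

# Hypothetical sketch of region-from-endpoint parsing (the plugin's real
# logic lives around line 174 of dynamodb.rb; this is only an illustration).
# AWS endpoints look like <service>.<region>.amazonaws.com, so the region
# can be pulled out of the hostname -- but a local endpoint has no region.
def region_from_endpoint(endpoint)
  host = URI.parse(endpoint).host
  match = host.match(/\A[^.]+\.([a-z0-9-]+)\.amazonaws\.com\z/)
  match && match[1]
end

puts region_from_endpoint("https://dynamodb.us-east-1.amazonaws.com").inspect  # "us-east-1"
puts region_from_endpoint("http://localhost:8000").inspect                     # nil
```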

marcosnils commented 8 years ago

@pewallin I made it work with DynamoDB Local by supplying a custom regions.xml to Logstash with the following parameter: JAVA_OPTS=-Dcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions.xml

The snippet I added to the file is the following:

    <Region>
      <Name>dynamodb</Name>
      <Endpoint>
        <ServiceName>dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>dynamodb</Hostname>
      </Endpoint>
      <Endpoint>
        <ServiceName>streams.dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>dynamodb</Hostname>
      </Endpoint>
    </Region>

NOTE: Change the <Hostname> as needed.

You can then configure the plugin as follows:

input {
  dynamodb {
    table_name => "YOUR_TABLE"
    endpoint => "http://dynamodb:7777"
    streams_endpoint => "http://dynamodb:7777"
    perform_scan => true
    view_type => "new_image"
    aws_access_key_id => "blabla"
    aws_secret_access_key => "blabla"
  }
}
pewallin commented 8 years ago

Thanks @marcosnils - that worked!

marcosnils commented 8 years ago

@pewallin plz close?

pewallin commented 8 years ago

@marcosnils I think this merits at least a README update before closing? Also, it looks like this repo is not what is currently published on RubyGems (which has newer versions), so I'm hesitant to open a PR.

marcosnils commented 8 years ago

@pewallin I'm not the repo owner or a contributor, so I can't take any action. Regarding what's on RubyGems: I'm the one who published a custom version of the gems there, as I need them for something else.

vaneavasco commented 8 years ago

I followed all the steps you described and I get:

--- jar coordinate commons-logging:commons-logging already loaded with version 1.1.3 - omit version 1.2
--- jar coordinate org.apache.httpcomponents:httpcore already loaded with version 4.4.4 - omit version 4.4.1
--- jar coordinate org.apache.httpcomponents:httpclient already loaded with version 4.5.2 - omit version 4.5
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAuthCache).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{:timestamp=>"2016-03-31T14:34:30.056000+0000", :message=>"Connection refused", :class=>"Manticore::SocketException", :level=>:error}
Error: AWS credentials invalid or not found in the provider chain
Invalid StreamArn (Service: AmazonDynamoDBStreams; Status Code: 400; Error Code: ValidationException; Request ID: 2a7a91b0-926b-4aec-aa89-d047140fa42c)
You may be interested in the '--configtest' flag which you can
use to validate logstash's configuration before you choose
to restart a running system.

Any ideas? I think something is wrong with the region; there's some ARN mismatch.

marcosnils commented 8 years ago

Are you passing the regions.xml file?

vaneavasco commented 8 years ago

Yes, I am. I followed the same steps as above, only in the config file I use the IP for the endpoint, and the table in DynamoDB has the same region as the one defined in the regions file (I checked the table's ARN). The strangest thing is that if I provide an invalid table name, I'm notified that the table name is invalid.

The regions file is parsed; I've made a small experiment, and if I add the port to the hostname it is no longer parsed and an error is returned.
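As a quick sanity check of that finding (the <Hostname> element must apparently be a bare host, with the port carried only in the plugin's endpoint URL), a small Ruby sketch can flag a port accidentally left in the hostname. This is an illustrative check, not the AWS SDK's parser.

```ruby
require 'rexml/document'

# Illustrative check, not SDK code: flag <Hostname> entries in a custom
# regions.xml that contain a colon (i.e. a port), since that reportedly
# breaks parsing of the regions file.
def bad_hostnames(xml)
  doc = REXML::Document.new(xml)
  doc.get_elements('//Hostname').map(&:text).select { |h| h.include?(':') }
end

good = '<Region><Name>dblocal</Name><Endpoint><ServiceName>dynamodb</ServiceName><Hostname>192.168.1.99</Hostname></Endpoint></Region>'
bad  = good.sub('192.168.1.99<', '192.168.1.99:7777<')

puts bad_hostnames(good).inspect  # []
puts bad_hostnames(bad).inspect   # ["192.168.1.99:7777"]
```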

    <Region>
      <Name>dblocal</Name>
      <Endpoint>
        <ServiceName>dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>192.168.1.99</Hostname>
      </Endpoint>
      <Endpoint>
        <ServiceName>streams.dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>192.168.1.99</Hostname>
      </Endpoint>
    </Region>

And my config:

input {
  dynamodb {
    table_name => "mytable"
    endpoint => "http://192.168.1.99:7777"
    streams_endpoint => "http://192.168.1.99:7777"
    perform_scan => true
    view_type => "new_image"
    aws_access_key_id => "mykey"
    aws_secret_access_key => "mysecret"
  }
}
marcosnils commented 8 years ago

@stefanvasco I'm not sure but I believe the endpoint might have something to do with the region / arn. Can't you use the same example as I posted before?

vaneavasco commented 8 years ago

Thx, I'll open a new issue, even with my amazon account I'm getting the same error.

yomansk8 commented 7 years ago

Hi, I'm facing the same issue, but in a Docker container. I tried your solution by adapting it in my Dockerfile, but it doesn't work:

ENV LS_JAVA_OPTS="-D:com.amazonaws.regions.RegionUtils.fileOverride=regions.xml"

My regions.xml file:

<Region>
  <Name>dynamodb</Name>
  <Endpoint>
    <ServiceName>dynamodb</ServiceName>
    <Http>true</Http>
    <Https>false</Https>
    <Hostname>localhost</Hostname>
  </Endpoint>
  <Endpoint>
    <ServiceName>streams.dynamodb</ServiceName>
    <Http>true</Http>
    <Https>false</Https>
    <Hostname>localhost</Hostname>
  </Endpoint>
</Region>

My config:

input { 
    dynamodb {
      endpoint => "http://localhost:9000" 
      streams_endpoint => "http://localhost:9000" 
      view_type => "new_and_old_images" 
      aws_access_key_id => "AKIAJWTE4NAGOHIIMHSQ" 
      aws_secret_access_key => "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" 
      table_name => "MusicCollection"
    } 
} 

Anyone have an idea?

mikebrules commented 7 years ago

Same issue for me; it works okay using a hostname as @marcosnils specified in the initial example (using "dynamodb" as the endpoint), but of course that hostname needs to be mapped (to my host's dynamodb) inside Docker (using add-host). As soon as I do this, I get:

{:timestamp=>"2017-03-06T23:29:48.807000+0000", :message=>"Pipeline aborted due to error", :exception=>java.lang.IllegalArgumentException: No region found with any service for endpoint http://dynamodb:8000,

If I leave out the add-host statement, Logstash starts, but of course can't resolve dynamodb as it has no entry in the container's hosts file...

Can anyone help? It seems to be impossibly hard to run this locally...

marcosnils commented 7 years ago

@mikebrules can you provide an example of how you're running the container and what exactly you get in the logs?

mikebrules commented 7 years ago

@marcosnils wow, thanks, quick! Sure...

Here is my regions config

 <Region>
    <Name>dynamodb</Name>
    <Endpoint>
        <ServiceName>dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>dynamodb</Hostname>
    </Endpoint>
    <Endpoint>
        <ServiceName>streams.dynamodb</ServiceName>
        <Http>true</Http>
        <Https>false</Https>
        <Hostname>dynamodb</Hostname>
    </Endpoint>
</Region>

and my logstash input

  input {
    dynamodb {
      endpoint => "http://dynamodb:8000"
      streams_endpoint => "http://dynamodb:8000"
      view_type => "new_image"
      perform_scan => true
      aws_access_key_id => "dsadsa"
      aws_secret_access_key => "dsadsa"
      table_name => "dev-blocks"
    }
  }
  filter {
      dynamodb {}
  }
  output {
      elasticsearch { host => "10.0.2.15:9200" }
      stdout { }
  }

because I now have to add "dynamodb" to my docker hosts, I do this...

  docker run --add-host 'dynamodb:10.0.2.15' 'base-logstash' -e ......

and that throws this...

  {:timestamp=>"2017-03-06T23:49:46.564000+0000", :message=>"Pipeline aborted due to error", :exception=>java.lang.IllegalArgumentException: No region found with any service for endpoint http://dynamodb:8000, :backtrace=>["com.amazonaws.regions.AbstractRegionMetadataProvider.getRegionByEndpoint(com/amazonaws/regions/AbstractRegionMetadataProvider.java:41)", "com.amazonaws.regions.RegionMetadata.getRegionByEndpoint(com/amazonaws/regions/RegionMetadata.java:90)", "com.amazonaws.regions.RegionUtils.getRegionByEndpoint(com/amazonaws/regions/RegionUtils.java:123)"

which (I presume) means it IS finding my dynamodb, but says no region found? I would rather just wire the IP address into my regions file, but I get the same issue... It must be something to do with the region my DynamoDB Local is running in?

Thanks loads for the help...

marcosnils commented 7 years ago

@mikebrules can you please show me the complete docker run command you're using? You need to send the env variables and also mount the config files for it to work. Also, which Docker image are you using?

mikebrules commented 7 years ago

@marcosnils sure - thanks again for your help

  docker run --add-host 'dynamodb:10.0.2.15' 'base-logstash' -e 'input {
  dynamodb{
    endpoint => "http://dynamodb:8000"
    streams_endpoint => "http://dynamodb:8000"
    view_type => "new_image"
    perform_scan => true
    aws_access_key_id => "dsadsa"
    aws_secret_access_key => "dsadsa"
    table_name => "dev-blocks"}
  }
  filter {
      dynamodb {}
  }
  output {
      elasticsearch
          { host => "dynamodb:9200" }
        stdout { }
  }'

Dockerfile

  FROM logstash:2.3

  RUN mkdir -p /config
  COPY ./src/regions.xml /config
  ENV JAVA_OPTS= -Jcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions.xml

  # Make .m2 accessible to logstash user, otherwise logstash won't start
  RUN mkdir -p /var/lib/logstash/.m2
  RUN ln -s /var/lib/logstash/.m2 /root/.m2

  ENV PATH /opt/logstash/vendor/jruby/bin/:$PATH

  RUN gem install logstash-input-dynamodb:'> 2' logstash-filter-dynamodb:'> 2'
  RUN plugin install logstash-input-dynamodb logstash-filter-dynamodb

Note: I seem to have to use '-J' for the JAVA_OPTS

mikebrules commented 7 years ago

More detail @marcosnils - I'm sure this is because the plugin can't find the regions file I'm creating. I've checked it is there by exec'ing into the running container, and tried setting every conceivable env var as I start the container (JAVA_OPTS, LS_JAVA_OPTS, both with -J and -D args):

  docker run  -it 'base-logstash' -e "LS_JAVA_OPTS=-Jcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions/regions.xml" -e "JAVA_OPTS=-Jcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions/regions.xml" -e "LS_JAVA_OPTS=-Dcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions/regions.xml" -e "JAVA_OPTS=-Dcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions/regions.xml" -e 'input {
  dynamodb{
    endpoint => "http://10.0.2.15:8000"
    streams_endpoint => "http://10.0.2.15:8000"
    view_type => "new_image"
    perform_scan => true
    aws_access_key_id => "dsadsa"
    aws_secret_access_key => "dsadsa"
    table_name => "dev-blocks"}
  }
  filter {
      dynamodb {}
  }
  output {
      elasticsearch
          { host => "10.0.2.15:9200" }
        stdout { }
  }'

Still the same error -

  Pipeline aborted due to error {:exception=>java.lang.IllegalArgumentException: No region found with any service for endpoint http://10.0.2.15:8000, :backtrace=>["com.amazonaws.regions.AbstractRegionMetadataProvider.getRegionByEndpoint(com/amazonaws/regions/AbstractRegionMetadataProvider.java:41)", "com.amazonaws.regions.RegionMetadata.getRegionByEndpoint(com/amazonaws/regions/RegionMetadata.java:90)", "com.amazonaws.regions.RegionUtils.getRegionByEndpoint(com/amazonaws/regions/RegionUtils.java:123)"

Any ideas? Pulling my hair out! Thanks in advance.

marcosnils commented 7 years ago

@mikebrules this is how I'm running my image (through Compose):

logstash:
    image: mantika/logstash-dynamodb-streams:2
    command: logstash -f /config/logstash.conf
    links:
            - db:dynamodb
    volumes:
            - ./:/config/
    environment:
            - LS_JAVA_OPTS=-Dcom.amazonaws.regions.RegionUtils.fileOverride=/config/regions.xml
    restart: always

Observations:

mikebrules commented 7 years ago

@marcosnils thanks for getting back, I'll check all that...

mikebrules commented 7 years ago

@marcosnils many thanks - looking good!

goldcaddy77 commented 7 years ago

@marcosnils - just wanted to say thanks for all of the info in here. There isn't a ton of info about this on the web and reading through your posts ^^^ got me up and running 👍

marcosnils commented 7 years ago

@goldcaddy77 glad it helped out!