edasque / DynamoDBtoCSV

Dump DynamoDB data into a CSV file
Apache License 2.0

Cannot find module 'commander' - a more detailed description in readme would be nice. #24

Closed: michaelboecher closed this issue 7 years ago

michaelboecher commented 7 years ago

I tried to run the script in bash on Mac OS X and have some issues. I'm not a developer, just a "noob" trying to convert a DynamoDB JSON file to CSV format so I can do some fancy Excel stuff with it.

Problem 1) Get the script running.

Problem 2) Once I get the script running, I guess a DynamoDB table export of ~600MB (via AWS Data Pipeline) might be a bit too much?

Current error:

```
sh-3.2# node dynamoDBtoCSV.js --help
module.js:472
    throw err;
    ^

Error: Cannot find module 'commander'
    at Function.Module._resolveFilename (module.js:470:15)
    at Function.Module._load (module.js:418:25)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/Users/Machine/Downloads/Data/dynamoDBtoCSV.js:1:77)
    at Module._compile (module.js:571:32)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)
sh-3.2#
```

My question: what is "commander", where can I get it, and how do I install it? Thanks for the support.

edasque commented 7 years ago

Start with `npm install`, then run the app as you have been. Not sure how it will behave with that much data; we'll see.
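
For anyone following along, the full sequence from a fresh checkout looks something like this (standard npm workflow; the directory name is whatever you cloned or unpacked into):

```
cd DynamoDBtoCSV
npm install            # installs the dependencies from package.json, including commander
node dynamoDBtoCSV.js --help
```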

michaelboecher commented 7 years ago

Thanks, man. `npm install` worked and the script is running. Let's see if the 600MB causes any trouble. I'll keep you posted. Thanks so far.

michaelboecher commented 7 years ago

Hey. It seems to run... and I guess it will take a while...

```
sh-3.2# node dynamoDBtoCSV.js -d -t ee-data
{ AttributeDefinitions: [ { AttributeName: 'url', AttributeType: 'S' } ],
  TableName: 'ee-data',
  KeySchema: [ { AttributeName: 'url', KeyType: 'HASH' } ],
  TableStatus: 'ACTIVE',
  CreationDateTime: 2016-10-10T12:30:51.774Z,
  ProvisionedThroughput:
   { LastIncreaseDateTime: 2017-04-27T14:38:38.370Z,
     NumberOfDecreasesToday: 0,
     ReadCapacityUnits: 100,
     WriteCapacityUnits: 100 },
  TableSizeBytes: 626469773,    ## ~600MB
  ItemCount: 40972,
  TableArn: 'arn:aws:dynamodb:eu-west-1:XXXXXXXXXXXX:table/ee-data',
  LatestStreamLabel: '2017-04-27T14:39:48.676',
  LatestStreamArn: 'arn:aws:dynamodb:eu-west-1:XXXXXXXXXXXX:table/ee-data/stream/2017-04-27T14:39:48.676' }
```

edasque commented 7 years ago

Is your read capacity at 100? That'll take forever.
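
For reference, provisioned read throughput can be raised temporarily with the AWS CLI (illustrative numbers; this assumes the `ee-data` table described above and remember to lower it again afterwards, since you pay for provisioned capacity):

```
aws dynamodb update-table --table-name ee-data \
    --provisioned-throughput ReadCapacityUnits=1000,WriteCapacityUnits=100
```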

michaelboecher commented 7 years ago

Yeah, the read capacity is 100. You mean I should increase it?

Meanwhile, the process was interrupted:

```
RangeError: Invalid string length
    at serialize (/Users/machine/Downloads/ee-data/node_modules/papaparse/papaparse.js:394:13)
    at Object.JsonToCsv [as unparse] (/Users/machine/Downloads/ee-data/node_modules/papaparse/papaparse.js:292:12)
    at Response.<anonymous> (/Users/machine/Downloads/ee-data/dynamoDBtoCSV.js:62:26)
    at Request.<anonymous> (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/request.js:360:18)
    at Request.callListeners (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
    at Request.emit (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
    at Request.emit (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/request.js:673:14)
    at Request.transition (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /Users/machine/Downloads/ee-data/node_modules/aws-sdk/lib/state_machine.js:26:10
```

Invalid string length. Great. How do I find out which string among these 40,972 items is too big? Anyhow, the script itself seems to work as expected. Now I have to find the item with the invalid string length.
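
On the "which item is too big" question, here is a minimal sketch, not part of DynamoDBtoCSV: it assumes the aws-sdk v2 module that `npm install` already pulled in, the `ee-data` table layout shown above (with `url` as the hash key), and a threshold you pick yourself. It scans the table and reports every plain string attribute over that length:

```javascript
// Hypothetical helper: flag items whose string attributes exceed a threshold.
var AWS = require("aws-sdk");
var dynamodb = new AWS.DynamoDB({ region: "eu-west-1" });

var THRESHOLD = 30000; // characters; adjust to taste

function scanPage(startKey) {
  var params = { TableName: "ee-data" };
  if (startKey) params.ExclusiveStartKey = startKey;

  dynamodb.scan(params, function (err, data) {
    if (err) throw err;
    data.Items.forEach(function (item) {
      Object.keys(item).forEach(function (attr) {
        var value = item[attr].S; // DynamoDB JSON: .S holds plain string attributes
        if (value && value.length > THRESHOLD) {
          console.log(item.url.S + ': attribute "' + attr + '" is ' + value.length + " chars");
        }
      });
    });
    // Each Scan call returns at most 1MB of data, so page through the rest.
    if (data.LastEvaluatedKey) scanPage(data.LastEvaluatedKey);
  });
}

scanPage();
```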

Okay. So far so good. Thanks @edasque (Erik).

michaelboecher commented 7 years ago

Works! @edasque

edasque commented 7 years ago

So did you run it again, or did it just continue on? How many records did you export?

michaelboecher commented 7 years ago

@edasque: In the end I exported 20,000 records. That's enough for a first analysis. The biggest issue was getting rid of the strings with ~30,000 characters in DynamoDB, which reliably caused an "exceed" error when trying to export to CSV. In the end, the script itself works perfectly. Thanks, mate.

edasque commented 7 years ago

What was wrong with those fields, you think? Too big?

michaelboecher commented 7 years ago

@edasque: Yes. DynamoDB itself seemed to have issues with strings longer than 30,000 characters.