edasque / DynamoDBtoCSV

Dump DynamoDB data into a CSV file
Apache License 2.0

Does the script really recognize CLI options? #65

Open simonNozaki opened 3 years ago

simonNozaki commented 3 years ago

I wonder whether some code is missing from dynamoDBtoCSV.js.

I cloned this repository, thinking "Wow, this is exactly what I wanted", and ran the script following the instructions in the README.

But running it printed a lot of "undefined"s. Inspecting the code, I think an instance is missing. After adding the line below and adjusting the related code, it finally ran properly:

const options = program.opts();

The official documentation of the "commander" library says we have to add the instance above: https://www.npmjs.com/package/commander

If the script is missing some components, I will open a pull request.

Please confirm the intended behavior of this script.

Thanks!

ArFe commented 3 years ago

> I wonder whether some code is missing from dynamoDBtoCSV.js. [...] After adding the line below and adjusting the related code, it finally ran properly: `const options = program.opts();` [...]

I ran into the same issue. The older version used to work, though. I think commander changed. Not sure if the owner is still active on this.

edasque commented 3 years ago

I am.

I am in the process of re-building this, not quite from scratch but doing a deep cleanup of that code. I am guessing we didn't pin down the version of commander enough. Watch this space for updates.

E.D.

edasque commented 3 years ago

If I were you, I would try replacing the "commander": ">=2.2.0" line in package.json with something like "commander": "^2.2.0".
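In package.json, that change would look something like this (a minimal sketch; the surrounding fields are omitted):

```json
{
  "dependencies": {
    "commander": "^2.2.0"
  }
}
```

The caret range stays within major version 2, whereas ">=2.2.0" lets npm install commander 7+, which moved parsed option values behind program.opts().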

ArFe commented 3 years ago

This would do it. I added this line (after .parse(process.argv)): const options = program.opts();

And replaced program with options thereafter.


The full modified dynamoDBtoCSV.js, for reference:

```js
const program = require("commander");
const AWS = require("aws-sdk");
const unmarshal = require("dynamodb-marshaler").unmarshal;
const Papa = require("papaparse");
const fs = require("fs");
const { stringify } = require("querystring");

let headers = [];
let unMarshalledArray = [];

program
  .version("0.1.1")
  .option("-t, --table [tablename]", "Add the table you want to output to csv")
  .option("-i, --index [indexname]", "Add the index you want to output to csv")
  .option("-k, --keyExpression [keyExpression]", "The name of the partition key to filter results on")
  .option("-v, --keyExpressionValues [keyExpressionValues]", "The key value expression for keyExpression. See: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html")
  .option("-c, --count", "Only get count, requires -k flag")
  .option("-a, --stats [fieldname]", "Gets the count of all occurances by a specific field name (only string fields are supported)")
  .option("-d, --describe", "Describe the table")
  .option("-S, --select [select]", "Select specific fields")
  .option("-r, --region [regionname]")
  .option("-e, --endpoint [url]", "Endpoint URL, can be used to dump from local DynamoDB")
  .option("-p, --profile [profile]", "Use profile from your credentials file")
  .option("-m, --mfa [mfacode]", "Add an MFA code to access profiles that require mfa.")
  .option("-f, --file [file]", "Name of the file to be created")
  .option("-ec --envcreds", "Load AWS Credentials using AWS Credential Provider Chain")
  .option("-s, --size [size]", "Number of lines to read before writing.", 5000)
  .parse(process.argv);

const options = program.opts();

if (!options.table) {
  console.log("Table: " + process.argv);
  console.log("Table: " + stringify(options));
  console.log("Table: " + options.table);
  console.log("You must specify a table");
  program.outputHelp(); // note: outputHelp lives on program, not on the opts() object
  process.exit(1);
}

if (options.region && AWS.config.credentials) {
  AWS.config.update({ region: options.region });
} else {
  AWS.config.loadFromPath(__dirname + "/config.json");
}

if (options.endpoint) {
  AWS.config.update({ endpoint: options.endpoint });
}

if (options.profile) {
  let newCreds = new AWS.SharedIniFileCredentials({ profile: options.profile });
  newCreds.profile = options.profile;
  AWS.config.update({ credentials: newCreds });
}

if (options.envcreds) {
  let newCreds = AWS.config.credentials;
  newCreds.profile = options.profile;
  AWS.config.update({
    credentials: {
      accessKeyId: process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY
    },
    region: process.env.AWS_DEFAULT_REGION
  });
}

if (options.mfa && options.profile) {
  const creds = new AWS.SharedIniFileCredentials({
    tokenCodeFn: (serial, cb) => { cb(null, options.mfa); },
    profile: options.profile
  });

  // Update config to include MFA
  AWS.config.update({ credentials: creds });
} else if (options.mfa && !options.profile) {
  console.log('error: MFA requires a profile(-p [profile]) to work');
  process.exit(1);
}

const dynamoDB = new AWS.DynamoDB();

const query = {
  TableName: options.table,
  IndexName: options.index,
  Select: options.count ? "COUNT" : (options.select ? "SPECIFIC_ATTRIBUTES" : (options.index ? "ALL_PROJECTED_ATTRIBUTES" : "ALL_ATTRIBUTES")),
  KeyConditionExpression: options.keyExpression,
  ExpressionAttributeValues: JSON.parse(options.keyExpressionValues),
  ProjectionExpression: options.select,
  Limit: 1000
};

const scanQuery = {
  TableName: options.table,
  IndexName: options.index,
  Limit: 1000
};

// if there is a target file, open a write stream
if (!options.describe && options.file) {
  var stream = fs.createWriteStream(options.file, { flags: 'a' });
}
let rowCount = 0;
let writeCount = 0;
let writeChunk = options.size;

const describeTable = () => {
  dynamoDB.describeTable({ TableName: options.table }, function (err, data) {
    if (!err) {
      console.dir(data.Table);
    } else console.dir(err);
  });
};

const scanDynamoDB = (query) => {
  dynamoDB.scan(query, function (err, data) {
    if (!err) {
      unMarshalIntoArray(data.Items); // Print out the subset of results.
      if (data.LastEvaluatedKey) {
        // Result is incomplete; there is more to come.
        query.ExclusiveStartKey = data.LastEvaluatedKey;
        if (rowCount >= writeChunk) {
          // once the designated number of items has been read, write out to stream.
          unparseData(data.LastEvaluatedKey);
        }
        scanDynamoDB(query);
      } else {
        unparseData("File Written");
      }
    } else {
      console.dir(err);
    }
  });
};

const appendStats = (params, items) => {
  for (let i = 0; i < items.length; i++) {
    let item = items[i];
    let key = item[options.stats].S;

    if (params.stats[key]) {
      params.stats[key]++;
    } else {
      params.stats[key] = 1;
    }

    rowCount++;
  }
};

const printStats = (stats) => {
  if (stats) {
    console.log("\nSTATS\n----------");
    Object.keys(stats).forEach((key) => {
      console.log(key + " = " + stats[key]);
    });
    writeCount += rowCount;
    rowCount = 0;
  }
};

const processStats = (params, data) => {
  let query = params.query;
  appendStats(params, data.Items);
  if (data.LastEvaluatedKey) {
    // Result is incomplete; there is more to come.
    query.ExclusiveStartKey = data.LastEvaluatedKey;
    if (rowCount >= writeChunk) {
      // once the designated number of items has been read, print the final count.
      printStats(params.stats);
    }
    queryDynamoDB(params);
  }
};

const processRows = (params, data) => {
  let query = params.query;
  unMarshalIntoArray(data.Items); // Print out the subset of results.
  if (data.LastEvaluatedKey) {
    // Result is incomplete; there is more to come.
    query.ExclusiveStartKey = data.LastEvaluatedKey;
    if (rowCount >= writeChunk) {
      // once the designated number of items has been read, write out to stream.
      unparseData(data.LastEvaluatedKey);
    }
    queryDynamoDB(params);
  } else {
    unparseData("File Written");
  }
};

const queryDynamoDB = (params) => {
  let query = params.query;
  dynamoDB.query(query, function (err, data) {
    if (!err) {
      if (options.stats) {
        processStats(params, data);
      } else {
        processRows(params, data);
      }
    } else {
      console.dir(err);
    }
  });
};

const unparseData = (lastEvaluatedKey) => {
  var endData = Papa.unparse({
    fields: [...headers],
    data: unMarshalledArray
  });
  if (writeCount > 0) {
    // remove column names after first write chunk.
    endData = endData.replace(/(.*\r\n)/, "");
  }
  if (options.file) {
    writeData(endData);
  } else {
    console.log(endData);
  }
  // Print last evaluated key so process can be continued after stop.
  console.log("last key:");
  console.log(lastEvaluatedKey);

  // reset write array. saves memory
  unMarshalledArray = [];
  writeCount += rowCount;
  rowCount = 0;
};

const writeData = (data) => {
  stream.write(data);
};

const unMarshalIntoArray = (items) => {
  if (items.length === 0) return;

  items.forEach(function (row) {
    let newRow = {};

    // console.log( 'Row: ' + JSON.stringify( row ));
    Object.keys(row).forEach(function (key) {
      if (headers.indexOf(key.trim()) === -1) {
        // console.log( 'putting new key ' + key.trim() + ' into headers ' + headers.toString());
        headers.push(key.trim());
      }
      let newValue = unmarshal(row[key]);

      if (typeof newValue === "object") {
        newRow[key] = JSON.stringify(newValue);
      } else {
        newRow[key] = newValue;
      }
    });

    // console.log( newRow );
    unMarshalledArray.push(newRow);
    rowCount++;
  });
};

if (options.describe) describeTable(scanQuery);
if (options.keyExpression) queryDynamoDB({ "query": query, stats: {} });
else scanDynamoDB(scanQuery);
```

simonNozaki commented 3 years ago

@edasque

Thanks for replying, and I can understand the status of the repository.

> If I were you, I would try replacing the "commander": ">=2.2.0" line in package.json with something like "commander": "^2.2.0"

That's right, I hadn't noticed that...

OK, I will watch this repository continuously.

caiconkhicon commented 3 years ago

Hi, I just bumped into this tool today, and the fix in this issue helped me run the program. However, it seems that the argument "--select" is not working. I am not sure if it is related or not. Can anyone please try it? Thanks

simonNozaki commented 3 years ago

> However, it seems that the argument "--select" is not working. I am not sure if it is related or not. Can anyone please try it? Thanks

We can't diagnose that from the given info alone. You should check or debug that CLI option by adding some logging around this code:

```js
const query = {
  TableName: options.table,
  IndexName: options.index,
  Select: program.count ? "COUNT" : (program.select ? "SPECIFIC_ATTRIBUTES" : (program.index ? "ALL_PROJECTED_ATTRIBUTES" : "ALL_ATTRIBUTES")),  // <== maybe this is not working properly
  KeyConditionExpression: program.keyExpression,
  ExpressionAttributeValues: JSON.parse(options.keyExpressionValues),
  ProjectionExpression: program.select,
  Limit: 1000
};
```

ArFe commented 3 years ago

> Hi, I just bumped into this tool today, and the fix in this issue helped me run the program. However, it seems that the argument "--select" is not working. I am not sure if it is related or not. Can anyone please try it? Thanks

I just tested and it worked for me. Must be something else or the way you're using select... if you post the error, we might be able to help.

caiconkhicon commented 3 years ago

> Hi, I just bumped into this tool today, and the fix in this issue helped me run the program. However, it seems that the argument "--select" is not working. I am not sure if it is related or not. Can anyone please try it? Thanks

> I just tested and it worked for me. Must be something else or the way you're using select... if you post the error, we might be able to help.

Sure. It is not an error, but the output still contains all attributes, while I only want to select some. Btw, I also added -v "{}" to every command to make it work, as suggested here (https://github.com/edasque/DynamoDBtoCSV/issues/64#issuecomment-696985593). For example:

```sh
node dynamoDBtoCSV.js -t userPool -p dev -v "{}" --select "username, id" -m 123456
```

If I use -d, it still prints out all rows before printing the description:

```sh
node dynamoDBtoCSV.js -t userPool -p dev -v "{}" -d -m 123456
```

ArFe commented 3 years ago

I'm not sure what it is yet, but it has to do with the -v "{}". Try something like this instead of the above:

```sh
-k "primarypartitionkey = :v1" -v '{\":v1\": {\"S\": \"primarypartitionkeyvalue\"}}'
```

It will select only the specific primary partition key, but --select will work properly. Not sure if it helps though.

I will take a closer look to see why the -v "{}" does not work with the --select.

ArFe commented 3 years ago

I just figured out the problem. When there is no -k parameter (options.keyExpression is null), it uses the scan query, which doesn't pass a ProjectionExpression at all...

```js
if (options.describe) describeTable(scanQuery);
if (options.keyExpression) queryDynamoDB({ "query": query, stats: {} });
else scanDynamoDB(scanQuery);
```

```js
const query = {
  TableName: options.table,
  IndexName: options.index,
  Select: options.count ? "COUNT" : (options.select ? "SPECIFIC_ATTRIBUTES" : (options.index ? "ALL_PROJECTED_ATTRIBUTES" : "ALL_ATTRIBUTES")),
  KeyConditionExpression: options.keyExpression,
  ExpressionAttributeValues: JSON.parse(options.keyExpressionValues),
  ProjectionExpression: options.select,
  Limit: 1000
};

const scanQuery = {
  TableName: options.table,
  IndexName: options.index,
  Limit: 1000
};
```

If you change the scan query as below, it will work. I will add it to the pull request:

```js
const scanQuery = {
  TableName: options.table,
  IndexName: options.index,
  ProjectionExpression: options.select,
  Limit: 1000
};
```

caiconkhicon commented 3 years ago

@ArFe : Thanks for your help. I will try it today and tell you the result.