aws-solutions / aws-data-lake-solution

A deployable reference implementation intended to address pain points around conceptualizing data lake architectures that automatically configures the core AWS services necessary to easily tag, search, share, and govern specific subsets of data across a business or with other external businesses.
https://aws.amazon.com/solutions/implementations/data-lake-solution/
Apache License 2.0

package ids beginning with dash can't be deleted properly #34

Closed: jgc234 closed this issue 4 years ago

jgc234 commented 5 years ago

I've run into an issue where package IDs starting with '-' can't be fully deleted. Elasticsearch rejects the generated search query with a parse exception, so the package itself is deleted but still shows up in search listings.

Log output from version 2.1.0:

2019-10-19T10:44:25.506Z    678bd893-cc8c-49dd-b4c1-b6de3c98da68    { body: { package_id: '-n0RdiWKK' },
resource: '/search/index',
httpMethod: 'DELETE',
headers: ......

path: '/data-lake/_search',
query: { q: 'package_id:-n0RdiWKK' },

displayName: 'BadRequest',
message: '[parse_exception] parse_exception: Encountered " "-" "- "" at line 1, column 11.\nWas expecting one of

I assume this could be fixed by restricting shortid to a safer 64-character set (in content-package.js:125) with shortid.characters(....), or maybe by escaping/quoting the package_id when building the Elasticsearch query (if that's feasible).
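For illustration, here is a minimal sketch of the escaping option. The function names (escapeQueryTerm, buildPackageQuery, buildTermQueryBody) are hypothetical and not taken from the solution's code. The reserved-character list follows the Elasticsearch query_string documentation as I understand it, and the term-query variant assumes package_id is indexed as an exact-match (keyword) field.

```javascript
// Characters with special meaning in Lucene/Elasticsearch query_string syntax.
// '<' and '>' cannot be escaped at all, so this sketch strips them instead.
const RESERVED = /[+\-=!(){}\[\]^"~*?:\\\/]|&&|\|\|/g;

// Hypothetical helper: backslash-escape every reserved character in a value.
function escapeQueryTerm(value) {
  return String(value)
    .replace(/[<>]/g, '')
    .replace(RESERVED, '\\$&');
}

// Hypothetical helper: build the URI-search parameters seen in the log above,
// with the package_id escaped so a leading '-' is not parsed as an operator.
function buildPackageQuery(packageId) {
  return { q: `package_id:${escapeQueryTerm(packageId)}` };
}

// Alternative (assumption: package_id is a keyword field): a term query in the
// request body is not run through the query-string parser, so no escaping is needed.
function buildTermQueryBody(packageId) {
  return { query: { term: { package_id: packageId } } };
}

console.log(buildPackageQuery('-n0RdiWKK').q);
console.log(JSON.stringify(buildTermQueryBody('-n0RdiWKK')));
```

Escaping keeps the existing query shape, while the term-query form sidesteps the parser entirely; which one fits depends on how the index is mapped.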

If you bulk-delete via the CLI, you'll be left only with packages beginning with '-', so I assume a dash in the middle still works OK.

beomseoklee commented 5 years ago

Thanks for your input, and sorry for the inconvenience. We will consider this for the next release.

georgebearden commented 4 years ago

The shortid character set has been updated to remove '-' in the latest commit. Please let us know if you need additional assistance.
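For anyone patching an older deployment by hand, a rough sketch of what a dash-free alphabet might look like follows. The replacement character ('@') and the validateAlphabet helper are assumptions made for illustration; the actual commit may have chosen a different character. shortid.characters() expects exactly 64 unique characters, which the helper checks before the alphabet is handed over.

```javascript
// Hypothetical dash-free alphabet: digits, lower case, upper case, '_' and '@'.
// '@' is an assumed stand-in for '-'; the real commit may use something else.
const SAFE_ALPHABET =
  '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_@';

// Characters that are special in Elasticsearch query_string syntax.
const QUERY_RESERVED = new Set('+-=&|><!(){}[]^"~*?:\\/');

// Hypothetical helper: check the constraints shortid places on a custom
// alphabet (64 unique characters) and reject query-reserved characters.
function validateAlphabet(alphabet) {
  const chars = [...alphabet];
  if (chars.length !== 64) {
    throw new Error(`expected 64 characters, got ${chars.length}`);
  }
  if (new Set(chars).size !== chars.length) {
    throw new Error('alphabet contains duplicate characters');
  }
  const bad = chars.filter((c) => QUERY_RESERVED.has(c));
  if (bad.length > 0) {
    throw new Error(`alphabet contains reserved characters: ${bad.join('')}`);
  }
  return alphabet;
}

console.log(validateAlphabet(SAFE_ALPHABET).length);

// Where shortid is installed, the validated alphabet would be applied once at startup:
// require('shortid').characters(validateAlphabet(SAFE_ALPHABET));
```

Note that changing the alphabet only affects newly generated IDs; packages that already have a leading '-' would still need the query-side handling or a manual cleanup.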