Case sensitivity - Githubissues

Martii commented 9 years ago

Hello. I'm currently one of the owners over at OpenUserJS.org and we seem to be having a recurring issue trying to update our dependency of aws-sdk. It basically crashes our app continually.

Currently we are at 2.0.21 of aws-sdk which currently works as case insensitive and previously tried to upgrade to 2.0.23 and 2.0.30 but aws-sdk throws an exception with error message at this commit summary.

I've narrowed it down to what appears to be a case sensitivity issue with our routine. What happens is we have some usernames that are lowercased for (case insensitive username) database retrieval, such as jri (site ref with uppercase), and the Key object name value in aws-sdk appears to be case sensitive now. I've done a dirty hack (very non-efficient) on working around the error with the following snippet:

// Get the script
s3.getObject({ Bucket: bucketName, Key: installName }, function (aErr, aData) {
  if (aErr) {
    console.error('`s3.getObject` Error reading ' + installName);
  } else {
    aCallback(aScript, s3.getObject({ Bucket: bucketName, Key: installName }).createReadStream());
  }
});

... this clearly isn't the right approach we would like to take but we are wondering if there could be some more light shed on Key case sensitivity when used with getObject and if there is a routine that we can use with getObject to ignore casing altogether so it doesn't throw the exception?

Any assistance would be appreciated. Our MVC code for S3 is all contained simply/reduced in that particular file. Thank you in advance.

installName for one of the offending exception throwers is jri/Geocaching_Map_Enhancements.user.js

AdityaManohar commented 9 years ago

... routine that we can use with getObject to ignore casing altogether so it doesn't throw the exception.

Amazon S3 keys are (and have always been) case sensitive. So if you are using the SDK to search for a key that doesn't exist, S3 will return an error. This is not an error that the SDK is throwing.

You should ensure that your are always using consistent casing on keys. If your keys come from user input, you should normalize casing before storing or reading them from S3.

I suspect this is where you are saving an object to S3. If so, you could change this line:

s3.putObject({ Bucket: bucketName, Key: installName, Body: aBuf },

to

s3.putObject({ Bucket: bucketName, Key: caseInsensitive(installName), Body: aBuf },

This should probably resolve your issue. Let me know if you have other questions.

Martii commented 9 years ago

Amazon S3 keys are (and have always been) case sensitive.

So why does at least 2.0.15 and 2.0.21 consider Key to be case insensitive (e.g. this is a newer issue that cropped up here from https://github.com/aws/aws-sdk-js/blob/4a8494a583eb1e8e9b9e5eea8380f86c272f4158/lib/request.js#L32 and doesn't currently exist as an issue in 2.0.21. We've been live for almost a year now and around October 2014 is when things seemed to change with aws-sdk) ? e.g. where exactly in the project (of aws-sdk) was something changed?

It would be extremely useful to have some sort of blurb in the official spec for getObject, and related, to state that it is officially case sensitive (Instead of just "Key — (String)")... The other day I found several responses in a web search that stated the exact opposite which is why I needed clarification with this issue because some test results were not congruent with prior development answers across the internet for at least a year.

On another interesting note aws-sdk doesn't throw the exception on this platform but only on nodejitsu which I think is very peculiar. I redirected the development environment to be a local production here (using the same DB on Amazon) and not once in several hours of testing did aws-sdk throw the exception from line 32... which was a little weird.

My parallel alternate theory is that we are lower casing somewhere in our code and we'll need to track that commit down but that was probably sometime earlier this year... so that is along the lines of your recommendation. Thanks and we'll see how this goes.

AdityaManohar commented 9 years ago

So why does at least 2.0.15 and 2.0.21 consider Key to be case insensitive

The SDK does not mutate data that you pass in to your request. This is not something that has changed. The SDK does validate data that you pass in to your request against the service operation API, and if something seems out of order, it returns an error. It does not autocorrect data. This allows you to be in control of your data.

Amazon S3 uses objects to store data. You can use objects to store any arbitrary data, including files, but objects do not emulate the characteristics of a file system. Objects names are unique (FOO != foo). The Amazon S3 documentation on objects keys and metadata explains this further.

I hope this helps you identify and isolate the problem. Let me know if there's anything else I can help with.

Martii commented 9 years ago

The SDK does validate data that you pass in to your request against the service operation API, and if something seems out of order, it returns an error.

Aha so that's what appears to have changed in the SDK versions. Thanks for the detailed explanation. We're slowly getting this squared away. Thank you again. :)

lsegal commented 9 years ago

Aha so that's what appears to have changed in the SDK versions.

I don't actually think the parameter validation behavior is what's changed in your scenario, though the way we notify of errors does, which I think might be related. The error you are seeing from the getObject call is coming from the service, not any validation in the SDK itself. My guess is one of two things have happened:

The SDK was always returning this error but never throwing it, so your application never "crashed". This error likely always occurred, though, and just was never handled. Based on the getObject code you listed above, it's very possible that your createReadStream() had been returning an empty payload in these cases without knowing about it if the 'error' event in the stream wasn't properly handled.
Your code at some point lowercased keys when getting them, and this changed at some point in your codebase as well. I agree this seems unlikely given that reverting to 2.0.21 resolves this, but it might also be a culmination of other changes in the SDK that could have affected your code path.

Martii commented 9 years ago

Little bit of both probably plus possibly another... users can type directly in the addy bar https://openuserjs.org/users/jri and cause some foobarring. We made a correction somewhere around May 2014 to June 2014 for that casing use case and that's probably where things started getting non-normalized on the Key a little bit in the DB. I've used your https://github.com/aws/aws-sdk-js/issues/397#issuecomment-60807108 to address capturing it instead of having our app crash... this helps too in the dev environment with a net timeout we have always had... but we'll need to fix our DB before we can upgrade.

One more related question... Is there a direct piece of software (Nix) to manually make corrections to the DB or does it always have to be programmatically with S3?

lsegal commented 9 years ago

You can visit the S3 console to inspect your keys and ensure they are normalized, or you can use the AWS CLI to modify your S3 keys in a fairly intuitive interface (aws s3 mv s3://mybucket/KEY s3://mybucket/key -- see the mv operation).

Martii commented 9 years ago

I'll have to request sizzle to give me that access then... I'll pass that info on. Kewl and thanks.

lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

aws / aws-sdk-js

Case sensitivity #430