Better way of retrieving parameters with multiple values

odyniec commented 13 years ago

(For the record: I tried sending this to the mailing list yesterday, but it hasn't shown up, for whatever reason.)

A few days ago on IRC there was a discussion about parameters with multiple values (such as checkboxes submitted through forms) and the slightly inconvenient way of handling them depending on whether there's just one value, or more, e.g.:

my $val = params->{foo};
if (ref $val) { print $val; } else { print @$val; }

@bigpresh considered two possible ways to improve this:

Borrowing a solution from PHP, where parameters with names ending with [] are always returned as lists (ugly)
Implementing a new method, params_multi, which always returns a list of values (preferred)

I'd like to add a third option to this list. It's something I've used in a recent project -- I wrote the following wrapper for param:

sub param2 {
    return ref(defined($_ = param(@_)) ? $_ : []) && wantarray ?
        @{$_||[]} : $_;
}

The wrapper, when called in scalar context, returns the same value as param, but when called in list context, always returns an arrayref (including returning an empty arrayref when the parameter is undef). So, if you expect a single value of a parameter, you call it like param:

my $val = param2("foo");

And when you expect multiple values, you wrap the call in an arrayref:

my $val = [ param2("foo") ];

I think it has a slight advantage over the params/params_multi duo in that you can still use just one method to retrieve parameters, and it's the context that distinguishes between getting a single value and a list of values.

What do you think about making param itself work this way?

Here are some more examples of what the wrapper returns for specific query strings:

Query string: (empty)
---------------------
param("foo") = undef;
param2("foo") = undef;
[ param2("foo") ] = [];

Query string: ?foo=
-------------------
param("foo") = '';
param2("foo") = '';
[ param2("foo") ] = [  ''  ];

Query string: ?foo=42
---------------------
param("foo") = '42';
param2("foo") = '42';
[ param2("foo") ] = [  '42'  ];

Query string: ?foo=42&foo=99
----------------------------
param("foo") = [  '42',  '99'  ];
param2("foo") = [  '42',  '99'  ];
[ param2("foo") ] = [  '42',  '99'  ];

bigpresh commented 13 years ago

The big issue with that kind of clever DWIMmery is that it doesn't always DWYM.

For instance, my $foo = param('foo') - OK, you got a straight scalar, as you expected.

However, whack it in to a hash without really thinking:

my %stuff = (
    foo => param('foo'),
);

... all of a sudden, $stuff{foo} is an arrayref, rather than the scalar the user probably expected - this is the kind of thing that can really trip up newbies and frustrate them. (Even experienced Perl programmers are likely to fall for this one occasionally, especially if they're coding with dangerous levels of blood in their caffeine stream...)

odyniec commented 13 years ago

The problem isn't that bad, because when the parameter value is actually a scalar, the wrapper would still return it as a scalar, even in list context. With arrayref values, it might indeed lead to the problem that you described -- but that use case still conforms to the basic rule of using the wrapper: if you intend to get multiple values, wrap the call in an arrayref.

my %stuff = (
    foo => [ param2('foo') ],
);

odyniec commented 13 years ago

I should correct myself -- I originally wrote that "The wrapper, when called in scalar context, returns the same value as param, but when called in list context, always returns an arrayref", but this isn't completely true. Actually, the wrapper returns a dereferenced arrayref if the parameter value really is a ref. Otherwise, it just returns the value, whatever it is (so it returns a scalar, if the value is a scalar).

xsawyerx commented 12 years ago

I prefer not to include DWIMish stuff, especially stuff that might break existing code for someone and isn't very DWIM (some might expect an arrayref instead of array when it's multiple).

Sorry, but I'm voting this one down.

bigpresh commented 12 years ago

@xsawyerx Fair enough. What do you think to the idea of a param_multi or param_list or similar keyword, which always returns a list, regardless of how many times a param appeared? I'm not sure it's important enough to warrant a new DSL keyword, but on the other hand it would be kind of nice to be able to avoid the nastyness of e.g. my $things = param->{things}; $things = [ $things ] unless ref $things; or similar.

xsawyerx commented 12 years ago

We're entering the problematic area of DBIx::Class::ResultSet which has ->resultset and ->resultset_rs because sometimes it returns this, and sometimes that. Of course, we're entering it from the other end, seeing as for us it's currently always consistent and we're adding a method to make it consistently different. It's obviously not as bad.

Perhaps the cleanest way is indeed to provide a new keyword param_list or param_as_list, but the thought makes me cringe... I'm still open to the idea though, as long as it's not changing the current DSL keyword.

odyniec commented 12 years ago

@xsawyerx yeah, it does break backwards compatibility (in specific limited cases, but still), so I agree that introducting such a change in a minor version might not be the best idea.

Anyway, maybe this is something that could be considered for Dancer 2 -- the problem of single/multiple value parameters is a very common one and it should have a simple solution. I still think the solution that I proposed has the advantage of being simple, as it requires no new functions and no conditional statements, and it makes use of a basic language construct (scalar/list context) to distinguish between single and multiple value parameters. As for it being DWIMish, I think it's just a matter of clearly documenting how it should be used.

xsawyerx commented 12 years ago

@odyniec even though I haven't wrote anything in a month, I still had this in the back of my mind (and open in a Firefox tab :) and gone over the different options. I've discussed it with rafl and racke and sukria some more.

We've reached a decision that in Dancer 2 the params() keyword would give you only one value for a key instead of optionally multiple ones, and that we would use a specific new keyword to get the params as an array reference, all of them, including those containing just one value.

This would provide two completely deterministic methods of getting your information. You always know what comes out, no matter what the context is. The only thing we'll need to deal with is people saying they only get one value instead of multiple ones in params() and we'll show them the documentation for params_multi() or whatever we choose to call it.

odyniec commented 12 years ago

@xsawyerx, thanks for the update. I must say I still like my idea a bit better :-) I think both my approach and the params/params_multi duo require some documentation to allow the developers to understand how they should be used, but my idea is a little simpler and allows for shorter code constructs. That said, I understand your reasoning about making the two methods fully deterministic, and it will certainly be a significant improvement over what we have now.

shadowcat-mst commented 12 years ago

The list context "DWIM" (a) breaks things in TT, which always calls in list context, and (b) can really screw you when constructing hashes.

In fact, CGI.pm's usage of this is a leading cause of security problems and bugs in CGI.pm using scripts. Therefore, it's an awful idea, and odyniec, I recommend removing this from your code before you implement something exploitable by accident.

A better solution would be to have params() return a Hash::MultiValue object like Plack::Request does, which is not only backwards compatible but provides ->get and ->get_all methods which correspond to "always give me a single value" and "always give me an array".

odyniec commented 12 years ago

In slight defense of my idea, the hash construction problem isn't that bad, since the list context is only taken into account if the value being retrieved actually is a list. So, assuming there's a scalar parameter foo, you're safe to do this:

%hash = (
    foo => param2('foo')
);

You only get screwed when you do the same with something that is a list, as in that case you should have said foo => [ param2('foo') ] instead.

Anyway, I understand that this idea introduces potential problems if used inappropriately, and I'm not going to promote it any further.

I just looked up Hash::MultiValue and it does look well-suited for this. @xsawyerx, have you considered that?

xsawyerx commented 12 years ago

@odyniec I haven't considered Hash::MultiValue but it does sound like a possible way to handle the issue. I'll look into it further. Thanks.

shadowcat-mst commented 12 years ago

That's no defence of your idea at all.

That means that a user adding an extra foo value to the form can screw you up, because suddenly it is "actually a list".

The fact that you can't see this is exactly why your idea is utterly unsafe, and should not be used in production code - and why Catalyst considers our CGI.pm compat param() method to be legacy, because it's bloody dangerous.

odyniec commented 12 years ago

@shadowcat-mst right, I see the real problem now, and I feel silly for not having thought of that case before. Thanks for pointing that out.

shadowcat-mst commented 12 years ago

The fact that it needs pointing out is what makes the feature so scary - because that means the only other way you're likely to find out is the hard way. Which I suspect is how I found out.

There's nothing wrong with giving the user a gun when they ask for one, but it's best not to hand them one that's already loaded, with the safety off, so that it points at their foot without them even doing anything.

... especially when you're one of the users :)

PerlDancer / Dancer

Better way of retrieving parameters with multiple values #649