libwww-perl / WWW-Mechanize

Handy web browsing in a Perl object
https://metacpan.org/pod/WWW::Mechanize
Other
68 stars 52 forks source link

Fix the handling of fields of type "file" #249

Closed gilmagno closed 2 years ago

gilmagno commented 6 years ago

Hi

I've found some inconsistencies in Mech's processing of fields of type "file". The data I'm giving to &field and to &set_fields is not being honored. I'm using them according to &submit_form's documentation.

These

$mech->field('document', ['/tmp/file.txt', 'the_file.txt']);
$mech->set_fields('document' => ['/tmp/file.txt', 'the_file.txt']);

won't produce the expected result, i.e., a HTTP request body that contains filename="the_file.txt".

I worked on the problem and found out that Mech is using a HTML::Form's API that has a bug and is considered "legacy" (according to a comment in HTML::Form's code). This causes problems to Mech.

But Mech's own use of the API also has a bug: when the field type is "file", Mech is not handling its value correctly.

The attached patch makes Mech use the current API and adds documentation to &set_fields and &field.

It's worth noting that &set_fields was doing (multiple times) what &field does, but was not reutilizing &field. The patch fixes this.

petdance commented 6 years ago

I don't have anything to say one way or the other on these. File uploads are alien to me at this level.

oalders commented 6 years ago

@petdance thanks. I didn't actually git blame, but I thought this might have originally been your work. Thanks for taking a look!

oalders commented 6 years ago

Note to self, check if $mech->submit_form(with_fields => { logfile => [ [ undef, 'whatever', Content => $content ], 1 ] } ); is specifically tested.

simbabque commented 2 years ago

I am rebasing and fixing this up now.

There is one large issue.

Note to self, check if $mech->submit_form(with_fields => { logfile => [ [ undef, 'whatever', Content => $content ], 1 ] } ); is specifically tested.

This does not work. When the first arg is undef, no content is sent at all. I am investigating and will create a draft PR with my current progress soon.

simbabque commented 2 years ago

There's a line in HTML::Form that stops the [undef, $filename, Content => 'content'] version to work.

    elsif (!defined($file) || length($file) == 0) {
  return;
   }

This is the non-legacy API mentioned in the original PR. I will amend this so we can have the undef if there is content.

I will also merge the OP's PR to HTML::Form to fix the legacy bug they found at the time (still on the original HTML::Form repo, gisle/html-form#10 and subsequently libwww-perl/HTML-Form#10).

simbabque commented 2 years ago

For the record, I'm putting this in here. I used a small Dancer2 program and a script for testing.

use Dancer2;

get '/' => sub {
    return <<'HTML';
<html><body>
<form name="foo" action="/upload" method="POST" enctype="multipart/form-data">
    <input type="file" name="logfile">
    <input type="submit" name="submit">
</form>
</body></html>
HTML

};

post '/upload' => sub {
    my $upload = upload('logfile');
    return
        sprintf
        <<'TEXT', $upload->filename, $upload->size, Dumper( $upload->headers ), $upload->content;
filename: %s
size: %s Bytes

headers:
%s

content:
%s
TEXT

};

dance;

And the script:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;
$mech->get('http://localhost:3000/');
$mech->set_fields( 'logfile' =>
        [ [ undef, 'foo.log', Content => 'time,severity,stuff' ], 1 ] );
$mech->submit;

$mech->get('http://localhost:3000/');
$mech->set_fields( 'logfile' =>
         [ [ 't/file_upload.html', 'changed_name.txt', Content => 'stuff' ], 1 ] );
$mech->submit;

I ran this with LWP::ConsoleLogger::Everywhere.

$ perl -MLWP::ConsoleLogger::Everywhere  -I lib test-file-upload.pl