jjlee / mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
http://wwwsearch.sourceforge.net/mechanize/

Mechanize won't detect all forms from a simple html page #82

Open eLvErDe opened 11 years ago

Hi,

Form parsing seems to be broken: mechanize finds only one of the two forms on a simple HTML page. Here is a minimal snippet that reproduces the problem:

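#!/usr/bin/python

import mechanize

# Configure the browser: honour <meta http-equiv> and redirects,
# send a Referer header, and ignore robots.txt.
br = mechanize.Browser()
br.set_handle_equiv(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# Fetch the login page and print every form mechanize detected.
br.open('http://192.168.1.1/login')
for f in br.forms():
    print(f)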

It prints only one form:

<POST http://192.168.1.1/login application/x-www-form-urlencoded
  <HiddenControl(page_ref=) (readonly)>
  <HiddenControl(method=button) (readonly)>
  <SubmitButtonControl(submit_button=off) (readonly)>>

However, the original HTML page contains two forms (form_web_button and form_auth_passwd):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
  <head>
  <title>neufbox -&nbsp;Authentification</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Cache-Control" content="no-cache, must-revalidate" />
  <meta http-equiv="Expires" content="Mon, 26 Jul 1997 05:00:00 GMT" />
  <meta http-equiv="Pragma" content="no-cache" />
  <meta http-equiv="Content-Script-Type" content="text/javascript" />
  <link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
  <link rel="stylesheet" type="text/css" media="screen" href="/css/reset.css" />
  <link rel="stylesheet" type="text/css" media="screen" href="/css/common.css" />
  <link rel="stylesheet" type="text/css" media="screen" href="/css/login.css" />
  <script type="text/javascript" src="/js/global.js"></script>
  <script type="text/javascript" src="/js/login.js"></script>
  </head>
  <body>
  <div id="header">
  <div id="logo">
  <a href="/" title="Accueil">Accueil</a>
  </div>
  <div id="infos">
  <table>
  <tr>
  <th>Version&nbsp;</th>
  <td>: NB6-MAIN-R3.2.12</td>
  </tr>
  <tr>
  <th>Adresse MAC&nbsp;</th>
  <td>: 30:7e:99:99:99:99</td>
  </tr>
  <tr>
  <th>Adresse IP&nbsp;</th>
  <td>: 80.99.99.99</td>
  </tr>
  <tr>
  <th>Profil d'accès&nbsp;</th>
  <td>:&nbsp;neufbox ADSL&nbsp;
  </td>
  </tr>
  </table>
  </div>
  </div>
<div id="menu">
  <ul>
  <li id="id_state_tab" class="tab_off">
  <a href="/state" title="Etat">Etat</a>
        </li>
  <li id="id_network_tab" class="tab_off">
  <a href="/network" title="Réseau">Réseau</a>
        </li>
  <li id="id_wifi_tab" class="tab_off">
  <a href="/wifi" title="Wifi">Wifi</a>
        </li>
  <li id="id_hotspot_tab" class="tab_off">
  <a href="/hotspot" title="Hotspot">Hotspot</a>
        </li>
  <li id="id_service_tab" class="tab_off">
  <a href="/service" title="Applications">Applications</a>
        </li>
  <li id="id_maintenance_tab" class="tab_off">
  <a href="/maintenance" title="Maintenance">Maintenance</a>
        </li>
  <li id="id_eco_tab" class="tab_off">
  <a href="/eco" title="Eco">Eco</a>
        </li>
  </ul>
</div>
<div id="submenu">
  <ul>
  <li class="tab_on">
  <a href="/login" title="Général">Général</a>
  </li>
  </ul>
</div>
  <div id="main">
<div class="info_notice" id="access_lock">
  <h1>Accès verrouillé</h1>
  Pour vous identifier, suivez les instructions ci-dessous :
</div>
<div class="title">
  <h1 class="large">Identification par bouton service</h1>
</div>
<div class="content">
  <img src="/img/img_led_service_nb6.png" align="right" />
  <strong>Appuyez environ 5 secondes</strong> sur le bouton service de votre neufbox jusqu'à ce qu'il clignote et cliquez sur le bouton <strong>Continuer</strong>.
  <br/><br/>
  <form method="post" action="/login" id="form_web_button">
  <input type="hidden" name="page_ref" value="/error" />
  <input type="hidden" name="method" value="button" />
  <div id="div_button_continue" class="button_submit">
  <button type="submit" 
                id="button_continue" 
                name="submit_button" 
                value="off">
  Continuer
  </button>
  </div>
  </form>
  <div class="spacer"></div>
</div>
<div class="title">
  <h1 class="large">Identification par mot de passe</h1>
</div>
<div class="content">
  Saisissez votre identifiant et votre mot de passe puis cliquez sur le bouton <strong>Valider</strong>.
  <br/>
  <form method="post" action="/login" id="form_auth_passwd">
  <input type="hidden" name="method" value="passwd" />
  <input type="hidden" name="page_ref" value="/error" />
  <input type="hidden" name="zsid" id="zsid" />
  <input type="hidden" name="hash" id="hash" />
  <table id="web_authentication">
  <tr>
  <th scope="row">Identifiant</th>
  <td>
  <input type="text" class="text" name="login" id="login" size="30" />
  </td>
  </tr>
  <tr>
  <th scope="row">Mot de passe</th>
  <td>
  <input type="password" class="text" name="password" id="password" size="30" />
  </td>
  </tr>
  </table>
  <div class="button_submit">
  <button type="submit" name="submit_button">Valider</button>
  </div>
  </form>
</div>
        </div>
        <div id="help">
        <h1 id="help_title"><span>Aide</span></h1>
        <p id="help_text"><em>Identification par mot de passe :</em> Saisissez votre identifiant et votre mot de passe pour accéder à l'interface d'administration de votre neufbox. Par défaut, l'identifiant est <strong>admin</strong> et le mot de passe est <strong>le code WiFi (WPA-PSK)</strong> se trouvant derrière votre neufbox.<br/><br/><em>Identification par bouton service :</em> Appuyez sur ce bouton pendant quelques secondes jusqu'à ce qu'il clignote puis cliquez sur <strong>Continuer</strong>.</p>
        </div>
    </body>
</html>
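
To confirm the expected count without mechanize, the form tags can be counted with a separate parser. The following is a minimal sketch, assuming Python 3 and its standard-library html.parser module. FormCounter, find_form_ids and SAMPLE are hypothetical names introduced here for illustration, and SAMPLE is a trimmed-down stand-in for the page above, not the full document.

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Collect the id attribute of every <form> start tag."""

    def __init__(self):
        super().__init__()
        self.form_ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names are lowercased.
        if tag == "form":
            self.form_ids.append(dict(attrs).get("id"))


def find_form_ids(html_text):
    """Return the ids of all forms in html_text, in document order."""
    parser = FormCounter()
    parser.feed(html_text)
    parser.close()
    return parser.form_ids


# Hypothetical, shortened version of the login page (assumption: only
# the two <form> elements matter for this check).
SAMPLE = """
<html><body>
<form method="post" action="/login" id="form_web_button">
  <input type="hidden" name="method" value="button" />
  <button type="submit" name="submit_button" value="off">Continuer</button>
</form>
<form method="post" action="/login" id="form_auth_passwd">
  <input type="hidden" name="method" value="passwd" />
  <input type="text" name="login" />
  <input type="password" name="password" />
  <button type="submit" name="submit_button">Valider</button>
</form>
</body></html>
"""

print(find_form_ids(SAMPLE))  # ['form_web_button', 'form_auth_passwd']
```

If this reports two forms for the real page while br.forms() yields one, the mismatch is in mechanize's form parsing, not in the markup.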

Looking forward to your answer!

Regards, Adam