cyberitsolutions / alloc-cli

A CLI that uses the alloc API
GNU Affero General Public License v3.0
1 stars 2 forks source link

Improve alloc-cli security and quality. #20

Open cjbayliss opened 5 years ago

cjbayliss commented 5 years ago

This is from PR #9:

On the 3rd of Jan 2019, Alex said:


Some great feedback from twb:

Now "python3-bandit -r ." works and found a couple dozen security issues.

Looks like the most serious one is the local shell injection of MAILER or BROWSER. Perhaps just a simple whitelist of accepted values would mitigate it.

The correct way to solve that is to not use os.system(). Ever.

subprocess.check_call([os.environ.get("BROWSER", "sensible-browser"), url])

This also avoids the yukky string concatenation where you try to quote the URL for system().

The only downside to this is that you cannot do something like BROWSER="chromium -no-sandbox" --- but this is already broken in plenty of other apps, and the workaround is simply to put that into a one-line shell script in your $PATH.


For alloc.err() and .dbg() I suggest

https://docs.python.org/3/howto/logging.html#when-to-use-logging

This has a "--quiet" concept built into it.

You can also do

print("Foo is", foo, file=sys.stderr, flush=True)

instead of

sys.stderr.write("!!! " + str(s) + "\n")
sys.stderr.flush()

Also don't do string concatenation, do format strings (python's printf), because they don't care if the arguments are strings or something else

'{} is {}'.format(x, y)

https://docs.python.org/library/string.html#format-examples https://docs.python.org/library/string.html#formatspec

In Python 3.7+ you can simplify this further, and just write

f'{x} is {y}'

(note I'd probably stick to the "".format() syntax, so as to not exclude python 3.6 users -- alex)


For print_task() you can create mail objects using email.mime.text or similar, e.g.

    m = email.mime.text.MIMEText(disclaimer)
    m['Date'] = email.utils.formatdate()
    m['From'] = 'noreply@{} (hello)'.format(myorigin)
    m['To'] = disclaimer_recipient
    m['Subject'] = 'Auto: {} email terms of use'.format(
        p.config.get('com.p.site-name'))
    m['Precedence'] = 'junk'
    if message_object and 'Message-ID' in message_object:
        # This is a response to a message sent.
        m['In-Reply-To'] = m['References'] = message_object['Message-ID']
        m['Auto-Submitted'] = 'auto-replied'
    else:
        # We can be called from padm when adding to a whitelist.
        m['Auto-Submitted'] = 'auto-generated'
    disclaimer_message = m

    # Hide errors because we might be sending a disclaimer to
    # a bogus address (e.g. spammer).
    # addresses by changing e.g.
    # "alice@gmail.com" to
    # "alice@gmail.com.INVALID". —twb, Jun 2017
    p.utils.sendmail(
        disclaimer_message.as_string(),
        hide_errors=True)

This (in principle) handles escaping automatically, but in practice has some derpage. The mbox "From " line can be managed automatically using

https://docs.python.org/library/mailbox.html

Instead of urllib/urllib2/urllib3, strongly recommend third-party "requests" library.

resp = requests.get('https://example.com/data.json')
resp.raise_for_status()   # throw exception on 4xx or 5xx
my_dict = resp.json()

For XML, third-party lxml is convenient for XPATH

base_url = 'https://example.com/data.html'
resp = requests.get(base_url)
resp.raise_for_status()   # throw exception on 4xx or 5xx
obj = lxml.html.fromstring(resp.text)
for url in obj.xpath('//div[@id="porn"]//@src'):
    url = urllib.parse.urljoin(base_url, url)
    print(url)

...but you probably prefer to use CSS selectors rather than XPATH. https://lxml.de/cssselect.html

Full example, using a persistent session to remember login cookie:

with requests.session() as sess:
    # spoof U-A
    sess.headers['User-Agent'] = ('Mozilla/5.0 (X11; Linux x86_64)'
                                  ' AppleWebKit/537.36 (KHTML, like Gecko)'
                                  ' Chrome/51.0.2704.106 Safari/537.36')
    # get nonce (tumblr-form-key)
    resp = sess.get('https://www.tumblr.com/login')
    resp.raise_for_status()
    form_key, = lxml.html.fromstring(resp.text).xpath('//meta[@name="tumblr-form-key"]/@content')
    # actually log in
    resp = sess.post('https://www.tumblr.com/login',
                     data={'user[email]': 'jkjkfsddsffe@mailinator.com',
                           'user[password]': 'pass',
                           'form_key': form_key})
    resp.raise_for_status()

    # Now logged in, ordinary gets will work.
    resp = sess.get('https://example.tumblr.com/page/1')
cjbayliss commented 5 years ago

Additionally I noted on that PR that we should use the built in json library from python3 instead of simplejson.