find_link_by_text doesn't like nested tags

medwards commented 11 years ago

Nested tags inside a link cause find_link_by_text and find_link_by_partial_text to fail even when they should match. For example, save https://gist.github.com/medwards/8ccbbe698a46ec21e48a to /tmp as nested-stuff.html, then run:

from splinter import Browser

foo = Browser()
foo.visit('file:///tmp/nested-stuff.html')
# Each lookup below is expected to find the nested link, but fails.
print(foo.find_link_by_text("split"))
print(foo.find_link_by_text("sp"))
print(foo.find_link_by_text("sp l it"))  # this is the output given by foo.find_by_css('table')[1].text
foo.quit()  # close the browser when done

hltbra commented 11 years ago

Hello, @medwards. Your case looks exceptional to me: when I search for text in links, I want to match the content of the a element, whereas you want to match the text that is rendered inside an a tag.

@andrewsmedina and @fsouza, do you have any opinion about this issue?
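To make the distinction between the content of a link and its rendered text concrete, here is a minimal sketch using only the standard library. The markup is a hypothetical stand-in (the gist's actual content is not reproduced here), and LinkTextCollector and link_texts are illustrative names, not part of splinter.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the text inside each <a> element, ignoring nested tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside an <a> element
        self.parts = []   # text fragments of the link being read
        self.texts = []   # finished link texts

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
            if self.depth == 1:
                self.parts = []

    def handle_endtag(self, tag):
        if tag == "a" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self.parts))

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def link_texts(markup):
    """Return the text of every link in markup, with nested tags stripped."""
    collector = LinkTextCollector()
    collector.feed(markup)
    collector.close()
    return collector.texts


# Hypothetical markup: the raw content of the link is "sp<b>l</b>it".
markup = '<a href="#">sp<b>l</b>it</a>'
texts = link_texts(markup)
print(texts)
```

The raw content of the link contains the nested tag, while the collected text is what a reader would see on the page, which is the string the original report tries to search for.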

medwards commented 11 years ago

That would be a fair point if the method were called find_link_by_HTML_content. As it stands, there is either a poorly named function or an incorrect implementation.

On Mar 11, 2013 3:45 AM, "Hugo Lopes Tavares" notifications@github.com wrote:

Hello, @medwards. Your case looks exceptional to me: when I search for text in links, I want to match the content of the a element, whereas you want to match the text that is rendered inside an a tag.

@andrewsmedina and @fsouza, do you have any opinion about this issue?


danilobellini commented 11 years ago

This issue makes the find_link_by_partial_text() method way too limited. The problem is not only "HTML content" but nested text too (e.g. <a><div>This text is nested</div>, but this ending isn't</a>). You can use XPath here to avoid this specific problem (below is a py.test-flavored example):

from splinter import Browser

def test_something():
    with Browser() as br:
        url = "https://github.com/danilobellini"
        br.visit(url)
        tag_name = "a"
        text_to_find = "audiolazy"
        expr = "//{}[contains(.//*[text()],'{}')]".format(tag_name, text_to_find)
        links = br.find_by_xpath(expr)
        assert len(links) == 1
        links.pop().click()
        assert br.is_text_present("(1 + z ** -2).plot().show()")

That finds the link (nested in github at the time of writing). =)
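A possible variation, sketched here and not verified against splinter or any particular browser driver: match on the string value of the link itself, so that text split across several nested elements, or mixed with the link's own text, is also covered. link_xpath is a hypothetical helper name, and the sketch simply refuses text containing double quotes.

```python
def link_xpath(tag_name, text_to_find):
    """Build an XPath for tag_name elements whose full text contains text_to_find.

    contains(normalize-space(.), ...) tests the concatenated text of the
    element and all of its descendants, with whitespace runs collapsed.
    """
    if '"' in text_to_find:
        # Quoting inside XPath string literals is out of scope for this sketch.
        raise ValueError("text containing double quotes is not handled here")
    return '//{}[contains(normalize-space(.), "{}")]'.format(tag_name, text_to_find)


expr = link_xpath("a", "audiolazy")
print(expr)
```

The resulting expression could then be passed to br.find_by_xpath(expr) as in the test above.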

The other way around would be filtering:

links = [el for el in br.find_by_css(tag_name) if text_to_find in el.html]

But that still matches against the HTML. The string "This text is nested, but this ending isn't", without the closing </div>, would be an easier query to write. Am I the only one who thinks regexes should be used instead of "partial text"?
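As a rough sketch of the regex idea, the filter can run against the rendered text of each element instead of its HTML. It assumes only that every element exposes a .text attribute, as splinter elements do in the examples above. filter_by_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins so the snippet runs without a browser.

```python
import re
from types import SimpleNamespace


def filter_by_text(elements, pattern):
    """Return the elements whose rendered text matches the regex pattern."""
    regex = re.compile(pattern)
    return [el for el in elements if regex.search(el.text or "")]


# Stand-ins for browser elements; only the .text attribute is used.
fake_links = [
    SimpleNamespace(text="This text is nested, but this ending isn't"),
    SimpleNamespace(text="Unrelated link"),
]

matches = filter_by_text(fake_links, r"nested,\s+but")
print(len(matches))
```

With a real browser, the same helper could be called as filter_by_text(br.find_by_css(tag_name), pattern).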

cobrateam / splinter

find_link_by_text doesn't like nested tags #219