edwardslabs / CloudBot

CloudBot - The simple, fast, expandable, open-source Python IRC Bot!
GNU General Public License v3.0

Horoscope.com changed page/URL layout #99

Closed IlGnome closed 7 years ago

IlGnome commented 7 years ago

Signs are now represented by a number in the URL, starting at 1 for Aries.

example: 'http://www.horoscope.com/us/horoscopes/general/horoscope-general-daily-today.aspx?sign=2'

The horoscope text now sits between a single set of <p> tags.

Dirty code for my use later:

    import requests
    from bs4 import BeautifulSoup

    url = 'http://www.horoscope.com/us/horoscopes/general/horoscope-general-daily-today.aspx?sign=2'
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')
    # The horoscope text is in the first <p> on the page.
    scope = soup.find('p')
    print(scope.text.strip())

edwardslabs commented 7 years ago

Thanks for looking at this. Not sure if you have all the signs mapped to numbers yet, but I found this in the site's source code:

              <option value="1">Aries</option>
              <option value="2">Taurus</option>
              <option value="3">Gemini</option>
              <option value="4">Cancer</option>
              <option value="5">Leo</option>
              <option value="6">Virgo</option>
              <option value="7">Libra</option>
              <option value="8">Scorpio</option>
              <option value="9">Sagittarius</option>
              <option value="10">Capricorn</option>
              <option value="11">Aquarius</option>
              <option value="12">Pisces</option>
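
One way the option list above could be turned into a lookup for building the request URL is sketched here. This is only an illustration: `SIGN_NUMBERS`, `BASE_URL`, and `horoscope_url` are hypothetical names, not existing CloudBot code, and the URL pattern is taken from the example earlier in this issue.

```python
from urllib.parse import urlencode

# Sign numbers copied from the <option> values in the site source.
SIGN_NUMBERS = {
    "aries": 1, "taurus": 2, "gemini": 3, "cancer": 4,
    "leo": 5, "virgo": 6, "libra": 7, "scorpio": 8,
    "sagittarius": 9, "capricorn": 10, "aquarius": 11, "pisces": 12,
}

# Assumed base URL, taken from the example URL in this issue.
BASE_URL = "http://www.horoscope.com/us/horoscopes/general/horoscope-general-daily-today.aspx"


def horoscope_url(sign):
    """Return the daily horoscope URL for a sign name (case-insensitive)."""
    key = sign.strip().lower()
    if key not in SIGN_NUMBERS:
        raise ValueError(f"unknown sign: {sign!r}")
    query = urlencode({"sign": SIGN_NUMBERS[key]})
    return f"{BASE_URL}?{query}"


print(horoscope_url("Taurus"))
```

Raising `ValueError` on an unknown sign is just one choice; a bot command would probably catch it and reply with a usage hint instead.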

The only other suggestion I would make is to specify the div that contains the <p> element you want. That way, if they insert another <p> tag before the one you want, the code should still work. The line below should work:

soup.find("div", class_="horoscope-content").find("p").text
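
A slightly more defensive version of that line might look like the sketch below, which returns None instead of raising AttributeError if the div or the <p> is missing. The `horoscope-content` class name is the one from the line above; `extract_horoscope` and the sample HTML are made up for illustration, and the built-in `html.parser` is used here instead of `lxml` only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup


def extract_horoscope(html):
    """Return the horoscope text from page HTML, or None if it is not found."""
    soup = BeautifulSoup(html, "html.parser")
    # Scope the search to the content div so earlier <p> tags are ignored.
    container = soup.find("div", class_="horoscope-content")
    if container is None:
        return None
    paragraph = container.find("p")
    if paragraph is None:
        return None
    return paragraph.get_text().strip()


# Hypothetical sample page: an unrelated <p> appears before the content div.
sample = """
<p>Advertisement</p>
<div class="horoscope-content"><p> Today is a good day. </p></div>
"""
print(extract_horoscope(sample))
```

With the sample above, a plain soup.find('p') would pick up the "Advertisement" paragraph, while the scoped lookup still returns the horoscope text.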