Closed mrhappyasthma closed 5 years ago
The exchange abbreviations seem to map as follows:
NYSE -> NYS
NASDAQ -> NAS
AMEX -> ASE
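For reference, a minimal sketch of how that mapping could be kept as a lookup table. The names `EXCHANGE_ABBREVIATIONS` and `abbreviate_exchange` are hypothetical, and only the three exchanges listed above are covered; anything else returns `None` rather than a guess.

```python
# Hypothetical lookup table built only from the three mappings listed above.
EXCHANGE_ABBREVIATIONS = {
    'NYSE': 'NYS',
    'NASDAQ': 'NAS',
    'AMEX': 'ASE',
}


def abbreviate_exchange(exchange):
    """Return the abbreviation for an exchange name, or None if it is unknown."""
    return EXCHANGE_ABBREVIATIONS.get(exchange.strip().upper())


print(abbreviate_exchange('nasdaq'))
```

Normalizing with `strip().upper()` is a design choice for the sketch, so that input like `' nyse '` still resolves.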
Actually an even better way seems to be:
https://www.msn.com/en-us/money/stockdetails/analysis?symbol=<symbol>
This seems to work well using lxml to parse the HTML.
Example code:
from lxml import html
from urllib.request import urlopen

url = 'https://www.msn.com/en-us/money/stockdetails/analysis?symbol=goog'
response = urlopen(url)
tree = html.fromstring(response.read())

def isfloat(value):
    # Element text may be None, which float() cannot handle.
    if value is None:
        return False
    try:
        float(value)
        return True
    except ValueError:
        return False

def nextFloatFromIterator(iterator):
    # Advance the shared iterator to the next element whose text parses as a float.
    node = None
    while node is None or not isfloat(node.text):
        node = next(iterator)
    return node.text

tree_iterator = tree.iter()
for element in tree_iterator:
    text = element.text
    if text == 'P/E Ratio 5-Year High':
        print('5 year high:')
        print(nextFloatFromIterator(tree_iterator))
    if text == 'P/E Ratio 5-Year Low':
        print('5 year low:')
        print(nextFloatFromIterator(tree_iterator))
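The same "find the label, then take the next float-like text" idea can be tried offline without hitting the site. This is only a sketch: it swaps lxml for the standard library's `html.parser` so it runs with no extra dependencies, the names `LabelValueParser` and `extract_ratios` are hypothetical, and `SAMPLE_HTML` is made-up markup with made-up numbers, not the real page layout.

```python
from html.parser import HTMLParser


def is_float(value):
    """Return True if value can be converted to a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


class LabelValueParser(HTMLParser):
    """Collects the first float-like text that follows each wanted label."""

    def __init__(self, labels):
        super().__init__()
        self.labels = set(labels)
        self.pending = None  # label that is still waiting for its value
        self.values = {}

    def handle_data(self, data):
        text = data.strip()
        if text in self.labels:
            self.pending = text
        elif self.pending is not None and is_float(text):
            self.values[self.pending] = float(text)
            self.pending = None


def extract_ratios(page_html, labels):
    """Return {label: value} for every label followed by a float-like text."""
    parser = LabelValueParser(labels)
    parser.feed(page_html)
    parser.close()
    return parser.values


# Made-up markup and numbers for illustration only.
SAMPLE_HTML = """
<ul>
  <li><p>P/E Ratio 5-Year High</p><p>n/a</p><p>45.12</p></li>
  <li><p>P/E Ratio 5-Year Low</p><p>17.30</p></li>
</ul>
"""

LABELS = ['P/E Ratio 5-Year High', 'P/E Ratio 5-Year Low']
ratios = extract_ratios(SAMPLE_HTML, LABELS)
print(ratios)
```

Unlike the iterator version, a label with no number after it is simply left out of the result instead of raising `StopIteration`.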
Done as of 84cd63734b9a98162a332790fd09a9e97e96271a
This one is trickier. I haven't found a good way to format the URL. It seems to be:
https://www.msn.com/en-us/money/stockdetails/analysis/fi-126.1...
We'd also need to scrape the price ratios. Ideally this would be a CSV download of some sort, but I can't seem to find one.