This could be user error, but I have tried every permutation I can think of without success.
I'm using the version of scrapemark.py updated on Aug 11, 2011.
Here is an example. If I pull the nested part out and split it manually, scrapemark processes each line correctly, but the nested version only finds the first match.
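For anyone who wants to check the manual-split approach without scrapemark, here is a minimal sketch using only Python's standard `re` module. It is not the scrapemark pattern from this report. The line layout ("H:MM : first; second; third; N ran."), the newline delimiter, and the names `RACE_RE` and `parse_races` are assumptions based on the output shown further down.

```python
import re

# Assumed layout of one result line, inferred from the output in this report:
#   "H:MM : first; second; third; N ran. ..."
# The third place is optional because small fields only list two placings.
RACE_RE = re.compile(
    r"(?P<h>\d+):(?P<m>\d+)\s*:\s*"
    r"(?P<first>[^;]+);\s*"
    r"(?P<second>[^;]+);\s*"
    r"(?:(?P<third>[^;]+);\s*)?"
    r"(?P<n>\d+) ran\."
)


def parse_races(text):
    """Split the block into lines and parse each line independently."""
    races = []
    for line in text.split("\n"):  # assumption: one race per line
        match = RACE_RE.search(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        races.append(match.groupdict())
    return races


sample = """
 2:20 : 2 Gale Force Ten (J P O'Brien, 7-2 ); 5 Leitir Mor (K J Manning, 11-10 fav); 4 Hard Yards (C D Hayes, 16-1 ); 8 ran.
 3:50 : 4 Sharestan (N G McCullagh, 8-11 fav); 2 Defining Year (S Foley, 8-1 ); 7 ran.
"""

for race in parse_races(sample):
    print(race)
```

Each line is matched on its own, so one line that fails to match does not stop the rest from being processed.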
from scrapemark import scrape
src = ''' <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n\n\n\n\t
\n\n\t\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\t\t\t\t\n\t\t
\n\t
\n\n\t
UK Horse Racing Results
\n\nSunday, 10 June 2012
\n
THIS ONLY RETURNS THE FIRST MATCH
results = scrape("""
UK Horse Racing Results
print results
THIS WORKS
results = scrape("""
UK Horse Racing Results
src = results["results"].replace("\n", "")
x = src.split("
")
for item in x:
print results
--------- RESULTS -----
{'third': u'4 Hard Yards (C D Hayes, 16-1 )', 'h': u'2', 'm': u'20', 'n': u'8', 'course': [u'Curragh'], 'second': u'5 Leitir Mor (K J Manning, 11-10 fav)', 'date': u'Sunday, 10 June 2012', 'first': u"2 Gale Force Ten (J P O'Brien, 7-2 )"}
{'third': u'4 Hard Yards (C D Hayes, 16-1 )', 'h': u'2', 'm': u'20', 'n': u'8', 'second': u'5 Leitir Mor (K J Manning, 11-10 fav)', 'first': u"2 Gale Force Ten (J P O'Brien, 7-2 )"}
{'third': u'4 Flying Doha (W J Lee, 7-2 2nd-fav)', 'h': u'2', 'm': u'50', 'n': u'15', 'second': u'3 Cape Of Approval (W Lordan, 2-1 fav)', 'first': u'10 Alsium (C D Hayes, 7-1 )'}
{'third': u'10 Erins Gal (R P Cleary, 20-1 )', 'h': u'3', 'm': u'20', 'n': u'12', 'second': u'5 Battleroftheboyne (B A Curtis, 12-1 )', 'first': u'7 Kateeva (L F Roche, 14-1 )'}
None
None
None
None
None
{'date': u'Sunday, 10 June 2012', 'course': [u'Curragh'], 'results': "\n\n 2:20 : 2 Gale Force Ten (J P O'Brien, 7-2 ); 5 Leitir Mor (K J Manning, 11-10 fav); 4 Hard Yards (C D Hayes, 16-1 ); 8 ran. 6 Newberry Hill (F M Berry, 11-4 2nd-fav);
\n\n 2:50 : 10 Alsium (C D Hayes, 7-1 ); 3 Cape Of Approval (W Lordan, 2-1 fav); 4 Flying Doha (W J Lee, 7-2 2nd-fav); 15 ran.
\n 3:20 : 7 Kateeva (L F Roche, 14-1 ); 5 Battleroftheboyne (B A Curtis, 12-1 ); 10 Erins Gal (R P Cleary, 20-1 ); 12 ran. 2 Lake George (R P Whelan, 5-1 joint-fav); 3 Allegra Tak (P J Smullen, 5-1 joint-fav);
\n 3:50 : 4 Sharestan (N G McCullagh, 8-11 fav); 2 Defining Year (S Foley, 8-1 ); 7 ran. 7 Learn (C O'Donoghue, 3-1 2nd-fav);
\n\n
"}