roclark / sportsipy

A free sports API written for python
MIT License
484 stars 187 forks

NFL Boxscore: Document is Empty #729

Open mikeyru123 opened 2 years ago

mikeyru123 commented 2 years ago

Describe the bug: Trying to pull boxscores for the 2021 NFL season results in an error saying "Document is empty".

To Reproduce: Sample code that causes the issue:

!pip install sportsipy

from sportsipy.nfl.boxscore import Boxscores, Boxscore

game_str = Boxscores(7,2021).games['7-2021'][0]['boxscore']
game_stats = Boxscore(game_str)
game_stats.dataframe

Expected behavior: I would like to see the boxscores of the games played in week 7 of the 2021 season.
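For readers who want every game of the week rather than only the first one, the structure indexed in the repro (a dict keyed by 'week-year' whose values are lists of game dicts with a 'boxscore' key) can be walked as in the sketch below. The helper name `boxscore_ids` and the stand-in identifiers are made up for illustration; real identifiers would come from `Boxscores(7, 2021).games`.

```python
def boxscore_ids(games_by_week):
    """Collect the 'boxscore' value of every game in a {'week-year': [game, ...]} dict."""
    return [
        game['boxscore']
        for games in games_by_week.values()
        for game in games
    ]


# Stand-in data shaped like the dict indexed in the repro; the IDs are placeholders.
sample_games = {
    '7-2021': [
        {'boxscore': 'placeholder-id-1'},
        {'boxscore': 'placeholder-id-2'},
    ],
}

ids = boxscore_ids(sample_games)
print(ids)
```

Each identifier could then be passed to `Boxscore(...)` in turn, as the repro does for the first game.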

Software: Google Colab on Chrome.


pepaananen commented 2 years ago

I am having the same issue. I am totally unable to get any stats for a given boxscore.

datasportslab commented 2 years ago

I am also having this issue. Any solution yet?

CodeeMcCoderson commented 2 years ago

I am having a similar issue. Pasting the traceback below.

Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 57, in fromstring
    result = getattr(etree, meth)(context)
  File "src\lxml\etree.pyx", line 3252, in lxml.etree.fromstring
  File "src\lxml\parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src\lxml\parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src\lxml\parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src\lxml\parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src\lxml\parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src\lxml\parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\users\python scripts\SportsNFL.py", line 266, in <module>
    pred_games_df, comp_games_df = prep_test_train(current_week,weeks,year)
  File "C:\users\python scripts\SportsNFL.py", line 242, in prep_test_train
    game_data = Boxscore('202112190buf')
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 296, in __init__
    self._parse_game_data(uri)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 784, in _parse_game_data
    value = self._parse_name(short_field, boxscore)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 447, in _parse_name
    return pq(str(boxscore(scheme)).strip())
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 217, in __init__
    elements = fromstring(context, self.parser)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 61, in fromstring
    result = getattr(lxml.html, meth)(context)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\lxml\html\__init__.py", line 875, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\lxml\html\__init__.py", line 763, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty
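The traceback ends in `pq(str(boxscore(scheme)).strip())`, which is consistent with a selector matching nothing, so that an empty string is handed to the parser. The sketch below shows one way to guard against that case. It is an illustration only, not sportsipy's code: `parse_fragment` is a hypothetical helper, and the parser is passed in as an argument so that the example runs without pyquery or lxml installed.

```python
def parse_fragment(markup, parser):
    """Parse markup with parser, or return None when there is nothing to parse.

    `parser` is any callable that takes a string; in the traceback above that
    role is played by pyquery's `pq`.
    """
    text = str(markup).strip()
    if not text:
        # An empty string is what triggers "Document is empty" in lxml.
        return None
    return parser(text)


# str.upper stands in for a real HTML parser in this demo.
empty_result = parse_fragment("   ", str.upper)
filled_result = parse_fragment("  <a>Bills</a> ", str.upper)
print(empty_result, filled_result)
```

Returning None instead of raising lets the caller decide whether a missing field is fatal.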

CodeeMcCoderson commented 2 years ago

In case you have not seen it yet, this was fixed in the following commit:

https://github.com/roclark/sportsipy/pull/725/commits/e2aabf3dd8d7609b87a9d2e6cf732eb4a9cc0a25

RichardSJTotten commented 2 years ago

I'm getting a similar error when attempting to access boxscores.

  File "src/lxml/etree.pyx", line 3252, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 663, in lxml.etree._raiseParseError
  File "<string>", line 3543
lxml.etree.XMLSyntaxError: line 3543: b'Tag use invalid'

@CodeeMcCoderson do you know how I can resolve this?

CodeeMcCoderson commented 2 years ago

@RichardSJTotten My problem seems almost identical to yours. I fixed it by changing the sportsipy module directly on my hard drive; I did not pull down the fix that I posted earlier.

Navigate to where your modules are stored on your local machine, find the 'sportsipy' module, and go into it. Then go into the 'nfl' module and open the 'constants.py' script.

Within that script, navigate to line 81 and change this line of code:
'home_name': 'a[itemprop="name"]:first',
to this:
'home_name': 'div[class="linescore_wrap"] table tbody tr:last td:nth-child(2)',

Next, go to line 84 and change this line of code:
'away_name': 'a[itemprop="name"]:last',
to this:
'away_name': 'div[class="linescore_wrap"] table tbody tr:first td:nth-child(2)',

After changing it, save the script, navigate to your script that was throwing the error and run it again. It should work.

Let me know if anything was not clear or if it does not work, I will try and help more.
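As an alternative to editing the installed file by hand, the same two selector changes could be applied at runtime. The sketch below is untested against sportsipy itself: it assumes the selectors live in a module-level dict in sportsipy/nfl/constants.py, and the name `BOXSCORE_SCHEME` used in the trailing comment is an assumption, as is the helper `apply_selector_patch`. The demo runs against a stand-in dict so that it works without the package installed.

```python
# Selector values copied from the instructions above.
PATCHED_SELECTORS = {
    'home_name': 'div[class="linescore_wrap"] table tbody tr:last td:nth-child(2)',
    'away_name': 'div[class="linescore_wrap"] table tbody tr:first td:nth-child(2)',
}


def apply_selector_patch(scheme, overrides=PATCHED_SELECTORS):
    """Update scheme in place and return the keys whose value changed."""
    changed = []
    for key, selector in overrides.items():
        if scheme.get(key) != selector:
            scheme[key] = selector
            changed.append(key)
    return changed


# Stand-in for the dict in constants.py, holding the two original entries.
demo_scheme = {
    'home_name': 'a[itemprop="name"]:first',
    'away_name': 'a[itemprop="name"]:last',
}
changed = apply_selector_patch(demo_scheme)
print(changed)

# Hypothetical use with the installed package (dict name is an assumption):
# from sportsipy.nfl import constants
# apply_selector_patch(constants.BOXSCORE_SCHEME)
```

Because the dict is mutated in place, any module that already imported the same dict object would see the new selectors. The patch has to run before the first `Boxscore` is created, and it is lost when the interpreter exits.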

RichardSJTotten commented 2 years ago

Thanks @CodeeMcCoderson !! Really appreciate the response 👍

This fix worked for me after tinkering a little. Turns out I had to pip uninstall the package first and then install locally for it to work as expected.
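If an edit to constants.py seems to have no effect, one thing worth checking is which copy of the file Python actually imports. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `locate_module_file` is made up for illustration.

```python
import importlib.util


def locate_module_file(module_name):
    """Return the file path Python would import for module_name, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return None
    if spec is None:
        return None
    return spec.origin


# Prints the path of the constants.py in use, or None if sportsipy is not installed.
print(locate_module_file("sportsipy.nfl.constants"))
```

If the printed path is not the file that was edited, the change was made to a different installation than the one being imported.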

Do you know if @roclark is planning to merge any of the new commits that have been worked out that solve issues like this? Also @roclark - this package is great! Thanks for creating it.

selvamshan commented 2 years ago

(Repeats @CodeeMcCoderson's instructions above for changing lines 81 and 84 of 'constants.py'.)

khampel commented 2 years ago

Thanks for the code adjustment, @selvamshan and @CodeeMcCoderson. I am getting an empty-document error as well, but mine may be caused by SCHEDULE_SCHEME in the constants file. The URL still appears to work, and I also changed lines 81 and 84. Here are the code and the error. Thanks.

from sportsipy.nfl.schedule import Schedule
import pandas as pd

team_one = 'GNB'  # assumption: team abbreviation; not defined in the original snippet
GNB_schedule = Schedule(team_one)

# Collect each game's extended dataframe, then concatenate once
# (DataFrame.append was removed in pandas 2.0).
frames = [game.dataframe_extended for game in GNB_schedule]
team_one_df_org = pd.concat(frames, ignore_index=True)

team_one_df_org

src/lxml/etree.pyx in lxml.etree.fromstring()
src/lxml/parser.pxi in lxml.etree._parseMemoryDocument()
src/lxml/parser.pxi in lxml.etree._parseDoc()
src/lxml/parser.pxi in lxml.etree._BaseParser._parseUnicodeDoc()
src/lxml/parser.pxi in lxml.etree._ParserContext._handleParseResultDoc()
src/lxml/parser.pxi in lxml.etree._handleParseResult()
src/lxml/parser.pxi in lxml.etree._raiseParseError()

XMLSyntaxError: Document is empty, line 1, column 1 (<string>, line 1)

Laneville commented 1 year ago

@CodeeMcCoderson It looks like this patch works for previous games that have already been played, but I am still getting this DocumentEmpty error for any games that have not been played yet

ericmk52 commented 1 year ago

> @CodeeMcCoderson It looks like this patch works for previous games that have already been played, but I am still getting this DocumentEmpty error for any games that have not been played yet

The fix appears to work for all seasons prior to the current 2022 season.
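Until unplayed games parse cleanly, one workaround is to skip the games that fail and keep the rest. The sketch below is a generic pattern rather than anything sportsipy provides: `load_available` is a hypothetical helper, and the loader and the exception types to skip are passed in so that the demo runs without the package. With sportsipy, the loader would presumably be `Boxscore`, and the exceptions would be the lxml errors shown in the tracebacks above.

```python
def load_available(game_ids, loader, skip_errors):
    """Call loader on each id; return (loaded dict, list of ids that raised)."""
    loaded = {}
    skipped = []
    for game_id in game_ids:
        try:
            loaded[game_id] = loader(game_id)
        except skip_errors:
            skipped.append(game_id)
    return loaded, skipped


def fake_loader(game_id):
    """Stand-in loader: pretends that ids starting with 'future' are unplayed."""
    if game_id.startswith('future'):
        raise ValueError('Document is empty')
    return {'id': game_id}


loaded, skipped = load_available(
    ['played-1', 'future-1', 'played-2'],
    fake_loader,
    skip_errors=(ValueError,),
)
print(sorted(loaded), skipped)
```

Passing the exception types explicitly keeps unrelated errors, such as a typo in the calling code, from being silently swallowed.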

calebhacala commented 1 year ago

@CodeeMcCoderson Is this fix still relevant? I tried it, but I am still getting the same error.