Kalanyr / gogrepoc

Python-based tool for downloading all your GOG.com game and bonus collections to your local computer for full offline enjoyment.

Games that corrupt the manifest DAT file #81

Open bassclarinetl2 opened 1 year ago

bassclarinetl2 commented 1 year ago

Version String: Python 3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)] on win32

Hi Kalanyr,

I thought (in a similar vein to https://github.com/Kalanyr/gogrepoc/issues/60 and https://github.com/Kalanyr/gogrepoc/issues/61) I'd pass along a number of games that seem to corrupt the manifest when running py -m gogrepoc update (without an existing gog-manifest.dat) and then py -m gogrepoc download. The corruption is related explicitly to null characters being present in the data file. It can be resolved manually by searching for the regex \0 and removing the entire affected game entry (i.e. from {'bg_url': to its corresponding closing brace }). The affected games are listed below the ######################### .

I have also noticed that many of the affected sections have excessively long changelog entries if that helps with diagnostics.

An excerpt of the series of outputs from running py -m gogrepoc download can be found here: https://pastebin.com/dwgKU9Jp

NOTE: In the pastebin, ">>>" preceded by an empty line is my redacted prompt.

#########################

Cossacks Anthology: https://www.gog.com/en/game/cossacks_anthology
Cyberpunk 2077: https://www.gog.com/en/game/cyberpunk_2077
Gabriel Knight: Sins of the Fathers: https://www.gog.com/en/game/gabriel_knight_sins_of_the_fathers_20th_anniversary_edition
Geneforge 1: https://www.gog.com/en/game/geneforge_15
Planescape: Torment: https://www.gog.com/en/game/planescape_torment_enhanced_edition
SpaceChem: https://www.gog.com/en/game/spacechem
Spelunky: https://www.gog.com/en/game/spelunky
Neverwinter Nights Diamond Edition: https://www.gog.com/en/game/neverwinter_nights_enhanced_edition_pack
Risen: https://www.gog.com/en/game/risen
Rise of the Triad: https://www.gog.com/en/game/rise_of_the_triad__dark_war and https://www.gog.com/en/game/rise_of_the_triad
Stellaris: https://www.gog.com/en/game/stellaris
Stronghold Crusader/HD: https://www.gog.com/en/game/stronghold_crusader
The Witcher: https://www.gog.com/en/game/the_witcher
The Witcher 2: Enhanced Edition (there seemed to be two entries for this game, although only one was affected): https://www.gog.com/en/game/the_witcher_2
SteamWorld Dig: https://www.gog.com/en/game/steamworld_dig
Thea 2 The Shattering: https://www.gog.com/en/game/thea_2_the_shattering
Yooka Laylee: Impossible Lair: https://www.gog.com/en/game/yookalaylee_and_the_impossible_lair
Alan Wake: https://www.gog.com/en/game/alan_wake
Alan Wake: American Nightmare: https://www.gog.com/en/game/alan_wakes_american_nightmare
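
For anyone who wants to confirm the null characters without opening the file in an editor, a minimal sketch along these lines may help. It assumes the manifest is the gog-manifest.dat in the current directory; `find_nul_offsets` and `report_nuls` are hypothetical helper names, not part of gogrepoc.

```python
from pathlib import Path


def find_nul_offsets(data: bytes, limit: int = 20) -> list[int]:
    """Return the byte offsets of the first `limit` NUL bytes in `data`."""
    offsets = []
    pos = data.find(b"\x00")
    while pos != -1 and len(offsets) < limit:
        offsets.append(pos)
        pos = data.find(b"\x00", pos + 1)
    return offsets


def report_nuls(path: str = "gog-manifest.dat", context: int = 40) -> None:
    """Print each NUL offset with a little surrounding data for orientation."""
    # Read raw bytes so a decoding error cannot hide the problem.
    data = Path(path).read_bytes()
    for offset in find_nul_offsets(data):
        start = max(0, offset - context)
        print(offset, data[start:offset + context])
```

Calling `report_nuls()` next to the manifest prints the offsets and nearby bytes, which should be enough to tell which game entry each one falls in.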

Kalanyr commented 1 year ago

Can you provide some more information, please? I have many of these games and don't get a corrupted manifest from them.

What language are you fetching the data in?

If the bad character occurs in a place that doesn't leak personal information (or unique URLs), could you please include a copy of that section of the manifest?

I have encountered situations where long changelogs include invalid characters, but I've fixed that.


Kalanyr commented 1 year ago

Hmmm. Not sure that's the same issue. That looks like GOG either sending bad info or GOGrepoc being run multiple times and writing to the manifest simultaneously.
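
If simultaneous writes do turn out to be involved, one common guard against a half-written file is to write to a temporary file and then rename it over the real one. The sketch below illustrates that general technique only; it is not gogrepoc's actual code, and `write_atomically` is a hypothetical helper name.

```python
import os
import tempfile


def write_atomically(path: str, text: str) -> None:
    """Write `text` to `path` so readers never see a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the new file in as a single step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach two overlapping runs would still race, but the last complete write wins instead of the two outputs being interleaved in one file.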

On Mon, 19 Dec 2022, 04:38 code8buster, @.***> wrote:

I believe I'm getting a similar problem. Here's a demonstrative extract showing things way out of place. I could provide the full file in an email. I also have The Witcher and NWN: Diamond Edition in my list.

'forum_url': 'https://www.gog.com/forum/caves_of_qud', 'galaxyDownloads': [], 'genre': 'Strategy', 'gog_messages': [], 'has_updates': False, 'id': 1625207125, 'image_url': '//images-2.gog-statics.com/423d183ed9056b1640c05a01adffdac2e1b2a4fa874229cb7b2b921e1873e2ae', 'long_title': 'Caves of Qud', 'media_type': '1', 'old_title': None, 'rating': 44, 'release_timestamp': 1531238100, 'serial': '', 'sharedDownloads': [], 'store_url': '/en/game/caves_of_qud', 'title': 'caves_of_qud'}, {'bg_url': '//images-3.gog-statics.com/dc3c4cfbc9ec2fae5c5e1be6956fb926fc0028df2854c8447c73ac0b8807126c', 'changelog': '', 'downloads': [], 'extras': [{'desc': 'manual', 'href': 'https://www.gog.com/downloads/deus_ex/13503', 'lang': '', 'md5': None, 'name': 'deus_ex_manual.zip', 'old_name': None, 'os_type': 'extra', 'prev_verified': False, 'size': 3200151,

I'm not sure what to look for in terms of sensitive info for posting the full manifest.


bassclarinetl2 commented 1 year ago

Sorry for the delay. I just reran it. It works once (login, update, and download), but when I run update again to make sure it hasn't missed anything, it throws:

```
14:52:44 | loading local manifest...
14:52:44 | fatal...
Traceback (most recent call last):
  File "H:\gog\gogrepoc.py", line 2931, in <module>
    main(process_argv(sys.argv))
  File "H:\gog\gogrepoc.py", line 2670, in main
    cmd_update(args.os, args.lang, args.skipknown, args.updateonly, not args.full, args.ids, args.skipids,args.skiphidden,args.installers,args.resumemode,args.strictverify)
  File "H:\gog\gogrepoc.py", line 1274, in cmd_update
    gamesdb = load_manifest()
              ^^^^^^^^^^^^^^^
  File "H:\gog\gogrepoc.py", line 369, in load_manifest
    ad = r.read()
         ^^^^^^^^
  File "<frozen codecs>", line 701, in read
  File "<frozen codecs>", line 504, in read
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfb in position 1004556: invalid start byte
```

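
The traceback only reports the first undecodable byte. To list more of them, a rough sketch like the following could be run against the raw file; `find_invalid_utf8` is a hypothetical helper, and it simply retries the decode after each failure.

```python
def find_invalid_utf8(data: bytes, limit: int = 10) -> list[tuple[int, int]]:
    """Return (offset, byte_value) for the first `limit` undecodable positions."""
    bad = []
    pos = 0
    while pos < len(data) and len(bad) < limit:
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as exc:
            # exc.start is relative to the slice, so shift it by pos.
            offset = pos + exc.start
            bad.append((offset, data[offset]))
            pos = offset + 1
    return bad
```

For example, `find_invalid_utf8(open("gog-manifest.dat", "rb").read())` would return the offsets to look at in a hex viewer. Bytes that follow a bad start byte may be reported as invalid too, so treat the list as a set of places to inspect rather than an exact count.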
bassclarinetl2 commented 1 year ago

To answer your earlier question regarding the manifest file itself and where I'm seeing the nul characters:

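[image: image] https://user-images.githubusercontent.com/5884987/208323898-147ca368-3b4b-4962-81d1-9c20cec6f151.png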

I'm using Sublime Text as the editor.

Kalanyr commented 1 year ago

Actually, could you explain what's out of place? Having looked at it again, it looks okay. The extract is just taken at the boundary between two entries: bg_url is the first key of an entry and title is the last (keys are sorted alphabetically when output).
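
The entry layout described here can be checked mechanically. This is only a sketch: it assumes the manifest text is a Python literal list of dicts, as the extract suggests, and `check_entries` is a hypothetical helper rather than anything in gogrepoc.

```python
import ast


def check_entries(manifest_text: str, required=("bg_url", "title")) -> list[str]:
    """Report entries that lack the keys expected at the start and end of an entry."""
    # literal_eval only accepts literals, so it will not execute code
    # that happens to be embedded in the file.
    entries = ast.literal_eval(manifest_text)
    problems = []
    for index, entry in enumerate(entries):
        missing = [key for key in required if key not in entry]
        if missing:
            problems.append(f"entry {index}: missing {missing}")
    return problems
```

If the file is damaged badly enough, `ast.literal_eval` will raise instead of returning, which is itself a sign that the manifest is no longer well-formed.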


Kalanyr commented 1 year ago

Hmm, interesting. That is definitely extremely corrupted; a large chunk of the entry is missing.

Could you back up the current manifest, delete it, and try a fresh run? Then back up the new manifest as well, straight after the update command. Then try the download.

My initial suspicion is that an entry was corrupt and processing has propagated that corruption, so starting over with a new manifest should either resolve the issue (if it results from something external to gogrepoc) or let us identify the source more easily.
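
For keeping the two copies apart, a small helper that timestamps each backup may be convenient. This is a sketch under assumptions: `backup_manifest` and the `.bak` naming scheme are made up for illustration and are not gogrepoc features.

```python
import shutil
import time
from pathlib import Path


def backup_manifest(path: str = "gog-manifest.dat", label: str = "") -> Path:
    """Copy the manifest next to itself with an optional label and a timestamp."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    suffix = f".{label}" if label else ""
    target = source.with_name(f"{source.name}{suffix}.{stamp}.bak")
    # copy2 also preserves the modification time, which helps when
    # comparing the copies later.
    shutil.copy2(source, target)
    return target
```

Calling `backup_manifest(label="before-update")` before the update and `backup_manifest(label="after-update")` right after it leaves two clearly named files to compare.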


bassclarinetl2 commented 1 year ago

Sorry for the delayed reply. Deleting the manifest (e.g. rm gog-manifest.dat) followed by a gogrepoc update does appear to fix things. That said, running an update against the existing manifest seems to introduce corruption. At this point it's more of a nuisance, since the workaround seems to fix things.

Kalanyr commented 1 year ago

So as far as you can tell, there's no corruption in the existing manifest prior to running the update, but there is afterwards? Could I get you to email me (Kalanyr AT nospam gmail DOT com) a copy of those two, then? I'd like to examine them. I suspect there's some kind of invisible corruption in the first manifest which the process of reading and updating is spreading. If the update data itself were corrupt, the new manifest should be corrupt too.
