Sude- / lgogdownloader

LGOGDownloader is an unofficial downloader for GOG.com for Linux users. It uses the same API as the official GOG Galaxy.
https://sites.google.com/site/gogdownloader/
Do What The F*ck You Want To Public License

transform internal variables (with regex?) #238

Open shakeyourbunny opened 1 year ago

shakeyourbunny commented 1 year ago

Is there a way to transform the game names provided by the GOG API into something custom and edit them before committing to disk (mainly %gamename%)?

The main reason I would like such a possibility is that I use the slug names for the game directories, and sometimes GOG does things like those listed below. The goal is to group a series together under the same base name, to avoid duplicate downloads and to save disk space.

I do know that symlinking would solve this request, but having to do that is really a problem for collectors.

My idea is to have some sort of 'sed'-like syntax, or (better) to put the rules in a text file similar to "blacklist.txt", a.k.a. "transformations.txt", which lgogdownloader parses by group or line, like:

%variable%/search/replace/i

where at the end:

Some things GOG does in the slug names that could be fixed after getting the slugs and before using them (e.g. for path names) are:

Removing a "_copy3" from the slug. Examples:

The fun part is that there are TWO versions of a game if you own it, one with the suffix and one without, and they have either different versions or the same one, as the "old" name has likely been abandoned; but if you delete the old slug, it will be redownloaded. Some offenders include "act of war high treason".

Removing "_base" from the game names. Why does GOG do this: because of DLCs / editions stuff, I think.

Sorting out the "the_" and "_the" issues. Why does GOG do this: I don't know; it often seems to be a random roll of the dice, or packagers at GOG who have different opinions on where that "the_" and "_the" should go.

My personal opinion is that such game names should stick with "the_" at the beginning, but your mileage may vary.

Some games have an "episode_x_slugname". Why does GOG do this: I don't know; it is really bothersome and bonkers, and the person responsible for these slugs should be fired.

These are two (episodic games / series):

Wild, jumbled naming scheme. Why does GOG do this: perhaps for every game, a different GOG employee with their own opinion on how to name the slug was allowed to do it.

This mainly concerns the Elder Scrolls franchise, whose naming scheme is wildly jumbled, with the bonus of the "_game" problem:

I don't know, though, how to transform this mess into sane defaults using only search & replace syntax (e.g. consistent numbering like "the_elder_scrolls_x_gametitle") and chopping off the chapter and so on.

I hope this feature would be useful.

shakeyourbunny commented 1 year ago

Some examples for the transforms.txt:

Lines starting with # and blank lines should be ignored.
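
A minimal Python sketch of how such a transforms.txt could be read, assuming the %variable%/search/replace/flags line format proposed above. The function names, the sample rules and the name "some_game_copy3" are hypothetical; this is not how lgogdownloader does it, and it assumes the search pattern contains no "/".

```python
import re

def parse_rule(line):
    """Parse '%variable%/search/replace/flags'; return None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Assumes exactly three separators; a '/' inside the pattern is not supported here.
    variable, search, replace, flags = line.split("/", 3)
    re_flags = re.IGNORECASE if "i" in flags else 0
    return variable, re.compile(search, re_flags), replace

def load_rules(text):
    """Collect all rules from the file contents, skipping ignored lines."""
    parsed = (parse_rule(line) for line in text.splitlines())
    return [rule for rule in parsed if rule is not None]

def transform(variable, value, rules):
    """Apply every rule registered for this variable, in file order."""
    for rule_variable, pattern, replacement in rules:
        if rule_variable == variable:
            value = pattern.sub(replacement, value)
    return value

# Hypothetical file contents.
EXAMPLE = """
# strip the trailing _game suffix
%gamename%/_game$//

%gamename%/_COPY[0-9]+$//i
"""

rules = load_rules(EXAMPLE)
print(transform("%gamename%", "quake_4_game", rules))
print(transform("%gamename%", "some_game_copy3", rules))
```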

Sude- commented 1 year ago

I'm thinking about using JSON as the config format for this. The config would look something like this:

{
    <string> :
    {
        "regex" : <string>,
        "replacement" : <string>
    },
    <string> :
    {
        "regex" : <string>,
        "replacement" : <string>
    }
}

Node names can be regex strings that are matched against the gamename; the "regex" key is then applied together with "replacement" to do the substitution.

An example config would look something like this. The first rule matches all games beginning with "b"; if they end with "_the", it removes "_the" at the end and prefixes the name with "the_". The second rule matches all games containing game_of_the_year_edition and replaces it with goty.

{
    "^b" :
    {
        "regex" : "(.*)_the$",
        "replacement" : "the_\\1"
    },
    ".*game_of_the_year_edition.*" :
    {
        "regex" : "game_of_the_year_edition",
        "replacement" : "goty"
    }
}
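
To illustrate the proposed semantics, here is a rough Python sketch that loads the example config above and applies it. It assumes rules run in file order and that each node name is tested against the name as it currently stands; the function name and the slug "beast_within_the" are made up for illustration and do not come from lgogdownloader.

```python
import json
import re

# The example config from above, kept as raw text so the JSON escapes survive.
CONFIG = r"""
{
    "^b" :
    {
        "regex" : "(.*)_the$",
        "replacement" : "the_\\1"
    },
    ".*game_of_the_year_edition.*" :
    {
        "regex" : "game_of_the_year_edition",
        "replacement" : "goty"
    }
}
"""

def transform(name, rules):
    """Apply each rule whose node name (a regex) matches the game name."""
    for match_pattern, rule in rules.items():
        if re.search(match_pattern, name):
            name = re.sub(rule["regex"], rule["replacement"], name)
    return name

rules = json.loads(CONFIG)  # dicts keep insertion order, so file order is preserved

# "beast_within_the" is a hypothetical slug used only to exercise the first rule.
print(transform("beast_within_the", rules))
print(transform("the_witcher_3_wild_hunt_game_of_the_year_edition_game", rules))
```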

shakeyourbunny commented 1 year ago

That would also be a neat idea, if something in this fashion were implemented.

Doing that only as a shell parameter is too much, I think; that is why I suggested a configuration file for it.

Sude- commented 1 year ago

dabfcfc adds support for transforming gamenames

--list transform can be used to show transformations. Use the new subdir template %gamename_transformed% to change the directory name.

$ cat ~/.config/lgogdownloader/transformations.json
{
    "witcher" :
    {
        "regex" : "^the_",
        "replacement" : "",
    },
}

$ lgogdownloader --list transform --game witcher                                 
Getting game names (3/3) 72 / 72
gwent_the_witcher_card_game -> gwent_the_witcher_card_game
gwent_the_witcher_card_game_ptr -> gwent_the_witcher_card_game_ptr
the_witcher_2 -> witcher_2
the_witcher_3_wild_hunt_game_of_the_year_edition_game -> witcher_3_wild_hunt_game_of_the_year_edition_game
the_witcher_goodies_collection -> witcher_goodies_collection
the_witcher -> witcher

shakeyourbunny commented 1 year ago

Thank you for your fast implementation, but I have some suggestions and found some things to consider:

{
    "_game$":
    {
       "regex": "_game$",
       "replacement": ""
    },
    ".*_the$" :
    {
        "regex" : "(.*)_the$",
        "replacement" : "the_\\1"
    },
}

Applying that to "back_to_the_future_the_game" yields this:

$ gogdownloader --list transformations --game back_to
back_to_the_future_the_game -> back_to_the_future_the
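
One possible reading of this result, sketched in Python: if every rule's node name is tested against the original name, the second rule never fires, because the original still ends in "_game". This is only a guess at the cause and not taken from the lgogdownloader source; the chain parameter is hypothetical.

```python
import re

# (node name, regex, replacement), in the same order as the config above.
RULES = [
    ("_game$", "_game$", ""),
    (".*_the$", "(.*)_the$", r"the_\1"),
]

def transform(name, rules, chain):
    """Apply rules in order.

    chain=False tests each node name against the original name;
    chain=True tests it against the result of the previous rules.
    """
    original = name
    for match_pattern, regex, replacement in rules:
        subject = name if chain else original
        if re.search(match_pattern, subject):
            name = re.sub(regex, replacement, name)
    return name

print(transform("back_to_the_future_the_game", RULES, chain=False))
print(transform("back_to_the_future_the_game", RULES, chain=True))
```

With chain=False the sketch reproduces the output shown above; with chain=True the second rule also applies and moves the trailing "_the" to the front.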

Aside from that, it would be really neat to have a built-in exception list and/or static transformation list (matching only the full name). The difference between "exception" and "static" is that a full game name found in the exception list is not transformed, while the static list contains fixed replacements.

{
    "_game$":
    {
       "regex": "_game$",
       "replacement": "",
       "exceptions": [
                "d_the_game",
                "an_elder_scrolls_legend_battlespire_game",        ## this is the official game title!
                "back_to_the_future_the_game",
                "darkest_hour_a_hearts_of_iron_game",
                "for_the_glory_a_europa_universalis_game",
                "gwent_the_witcher_card_game"
                ],
        "static": [
                {
                  "match":  "randals_monday_base_game",
                  "replace": "randals_monday"
                }
                ]
    }
}

This would yield:

d_the_game -> d_the_game
gwent_the_witcher_card_game -> gwent_the_witcher_card_game
randals_monday_base_game -> randals_monday
greedfall_game -> greedfall
quake_4_game -> quake_4
pathologic_classic_hd_game -> pathologic_classic_hd
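
The intended lookup order can be sketched in Python as follows, assuming the exception list is checked first, then the static list, then the regex. The dictionary layout mirrors the proposal above (with the node name stored under a hypothetical "match" key); it is not the format that lgogdownloader ended up using.

```python
import re

RULE = {
    "match": "_game$",
    "regex": "_game$",
    "replacement": "",
    # Full names that must never be transformed.
    "exceptions": [
        "d_the_game",
        "an_elder_scrolls_legend_battlespire_game",
        "back_to_the_future_the_game",
        "darkest_hour_a_hearts_of_iron_game",
        "for_the_glory_a_europa_universalis_game",
        "gwent_the_witcher_card_game",
    ],
    # Fixed replacements that apply only to an exact full name.
    "static": [
        {"match": "randals_monday_base_game", "replace": "randals_monday"},
    ],
}

def transform(name, rule):
    """Exceptions win, then static replacements, then the regex substitution."""
    if name in rule["exceptions"]:
        return name
    for entry in rule["static"]:
        if entry["match"] == name:
            return entry["replace"]
    if re.search(rule["match"], name):
        return re.sub(rule["regex"], rule["replacement"], name)
    return name

for game in ["d_the_game", "randals_monday_base_game", "greedfall_game"]:
    print(game, "->", transform(game, RULE))
```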

Sude- commented 1 year ago

Applying multiple rules should be fixed by a1a7fb4

Sude- commented 1 year ago

Exceptions should be relatively easy to implement, although I'll probably have to think about the format some more. It might also be nice to have an option to stop matching rules after a certain rule is matched. It could also be used to create exceptions, although not as well defined as an exception list inside the rule. Perhaps something like this:

{
    ".*_(card|the)_game$":
    {
       "regex": "this_doesnt_actually_matter",
       "replacement": "this_doesnt_matter_either",
       "continue": false,
    },
    "_game$":
    {
       "regex": "_game$",
       "replacement": "",
       "continue": true,
    },
}
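
A rough Python sketch of how the "continue" flag could behave, assuming rules are evaluated in file order and a matching rule with "continue": false stops all further processing. The defaulting of a missing flag to true is an assumption of this sketch.

```python
import re

# (node name, rule body) pairs in file order, mirroring the config above.
RULES = [
    (".*_(card|the)_game$", {
        "regex": "this_doesnt_actually_matter",
        "replacement": "this_doesnt_matter_either",
        "continue": False,
    }),
    ("_game$", {
        "regex": "_game$",
        "replacement": "",
        "continue": True,
    }),
]

def transform(name, rules):
    """Apply matching rules in order; stop after a rule whose "continue" is false."""
    for match_pattern, rule in rules:
        if not re.search(match_pattern, name):
            continue
        name = re.sub(rule["regex"], rule["replacement"], name)
        if not rule.get("continue", True):
            break  # acts as an exception: later rules never see this name
    return name

for game in ["gwent_the_witcher_card_game", "back_to_the_future_the_game", "greedfall_game"]:
    print(game, "->", transform(game, RULES))
```

Because the first rule's regex never matches anything, it leaves the name untouched and only serves to stop the "_game$" rule from running.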

Sude- commented 1 year ago

06c033e adds support for exceptions

I'll probably do some changes to the rules file format later. I'm thinking about combining blacklist and ignorelist to the same rules file.