Parametrised export options and general improvements

Varstahl commented 4 years ago

added: optional CSV delimiter for localised import
~removed: useless platformList queries~
added: summary export
Introduced a mini connection manager, to ensure proper cleanup even on an unhandled exception
Split the queries into manageable chunks
Created a wrapper to make addition of new fields manageable and self-contained
Renamed some of the MasterList aliases to avoid confusion
Fixed text parsing and included CR in the new line checks
Partial DLC exclusion (their library is a bit of a mess, we'll need some manual specification)

It's quite a bit to digest, and slightly overkill, but "in for a penny, in for a pound", am I right?

For the sake of keeping things DRY I created a couple mini wrappers that reduce the boilerplate needed for this thing to work, mainly the ba(), id() and jsl() functions, all tiny and documented. Also, to keep things independent without going crazy, I created a parametrised prepare() function, that setups the query and helper variables to be used when parsing results.

For example, after the introduction of prepare(), when adding fields to the CSV we no longer have to track the results order manually, accessing them results[int] is not viable and results[positions['name']] avoids mistakes and debug headaches.

There is also the option to provide CSV fields by parsing content already available from the query. ~While doing this I noticed that the platformList queries were completely unnecessary and redundant, so I stripped the code of them.~

As for the jsl() I've added some parsing to be used with title and summary, that cleans up a bit of \n and <br/> to makes the CSV more consistent and accessible.

Let me know what you think.

Edit: noticed that the platformList was helpful in de-duping elements with diverging titles, so I reintroduced it.

AB1908 commented 4 years ago

This is fantastic. I have a suggestion though and I'd like to hear your thoughts.

Would it be possible to modularise a little more? I'm seeing a lot of row['key'] = metadata['key'] and I feel like the argument key could be passed to a separate function which would reduce the row=metadata instances.

Varstahl commented 4 years ago

I guess we could generalise that with a function that takes an enum as a parameter, so that it automatically tries to convert based on the input requirements.

Another thing that I thought of doing was defaulting every option to True when -a is specified, so instead of checking .all or .field we could just check .field. Then I thought that someone might not want to get a particular field always exported, so I haven't done it (yet, at least).

Also, partially marginal, I published this tool a couple days ago. I'm still working on it but it's working great imo, although there's some more work to do on both tools.

In any case it sounds reasonable, I was just about to start checking a couple things of the export procedure, I'll get on these changes right away.

Varstahl commented 4 years ago

@AB1908 implemented both and also fixed a problem with carriage returns, let me know if there's more to add.

Edit: while working on DLC filtering I found cases where the conversion from list to comma separated is not recommended and would require lots of custom parsing (ie: "whatever company, ltd"). I'm removing the list to string to let the CSV module do its thing.

Varstahl commented 4 years ago

@AB1908 I think I've done all in my power to fix the issues we had with DLCs and such. The only liberty I took was to replace sorted with natsorted. If you don't want the additional python dependency you can merge this PR and then replace the two instances of natsorted with sorted, and other than numbers sorted ASCII-betical, the rest will work just fine.

There is only one outstanding problem with the export, that is with items renamed as duplicate <rest of the title>, but there's no easy way to fix this because it changes based on the user's libraries and platforms, so I'd rather not address that in the exporter, where it doesn't belong. Especially considering that some games listed as duplicates are the only entry to begin with.

Also, I didn't quite find a way to "detect" DLCs easily. TEST DEVELOPER(?: 2)? and TEST PUBLISHER(?: 2)? are used for actual CDPR games/DLCs, so I can't exclude them. Some of the DLC entries are also duplicate, therefore I couldn't filter them out by analysing games DLCs. I ended up doing both to the best of my abilities, excluding known DLC ids from the list, plus a few titles that kept popping up (ref: Varstahl@gle.py#L265-L273).

If you guys find a better way, it should be even better.

As for the rest, this PR has now all the QoL improvements I wanted from it, but I'm still up to adding/removing things if you need them, just let me know.

Varstahl commented 4 years ago

~Introduced a bug in the export somewhere down the line, don't merge it yet.~ fixed, typo

AB1908 commented 4 years ago

Massive work. Thank you so much. I'm not even sure I understand the entire script now but I'm sure it works well.

AB1908 commented 4 years ago

I hope you don't mind but could you rewrite "usage" and "dependency" sections of the readme? Feel free to include a link to your project as well under a Related Projects header.

Varstahl commented 4 years ago

Sure, I'll open another PR later.

Varstahl commented 4 years ago

I hope you don't mind but could you rewrite "usage" and "dependency" sections of the readme?

Available now in #24

I'm not even sure I understand the entire script now

I tend to have that effect on developers, sadly. I tried to comment as much as I could, but if you feel lost in any section or need any clarifications ask away.

AB1908 commented 3 years ago

I have found that certain users wish to just execute the script without having to fiddle with the arguments, akin to v0.1. Does it make sense to modify default args to ensure backwards compatibility?

Varstahl commented 3 years ago

The default tab works perfectly fine, it's a totally valid CSV and makes most sense in the context of Galaxy exports. The usual --all should work fine. Also there is no real default for CSVs, in several countries it defaults to ;, in others to ,, in others to \t, I just used the default that will work for everyone.

Reverting back to , will cause more harm then good, I think.

This is my opinion, at least.

AB1908 commented 3 years ago

I'm actually not advocating for reverting back to the old delimeter. My intention was to ask whether --all should be the default parameter for folks who don't want to be troubled and to ensure compatible "behaviour" with the earlier tag. Does this make sense? In case --all is already the default arg, it isn't working properly and I'll try to get around to making a fix for it.

Varstahl commented 3 years ago

Ah, that. That's… actually a good point. Yeah, we could make it export everything by default. I'll work on it a minute.

AB1908 commented 3 years ago

Aye many thanks. Take your time!

AB1908 / GOG-Galaxy-Export-Script

Parametrised export options and general improvements #23