Closed Varstahl closed 4 years ago
This is fantastic. I have a suggestion though and I'd like to hear your thoughts.
Would it be possible to modularise a little more? I'm seeing a lot of row['key'] = metadata['key']
and I feel like the argument key
could be passed to a separate function which would reduce the row=metadata instances.
I guess we could generalise that with a function that takes an enum as a parameter, so that it automatically tries to convert based on the input requirements.
Another thing that I thought of doing was defaulting every option to True
when -a
is specified, so instead of checking .all or .field
we could just check .field
. Then I thought that someone might not want to get a particular field always exported, so I haven't done it (yet, at least).
Also, partially marginal, I published this tool a couple days ago. I'm still working on it but it's working great imo, although there's some more work to do on both tools.
In any case it sounds reasonable, I was just about to start checking a couple things of the export procedure, I'll get on these changes right away.
@AB1908 implemented both and also fixed a problem with carriage returns, let me know if there's more to add.
Edit: while working on DLC filtering I found cases where the conversion from list to comma separated is not recommended and would require lots of custom parsing (ie: "whatever company, ltd"). I'm removing the list to string to let the CSV module do its thing.
@AB1908 I think I've done all in my power to fix the issues we had with DLCs and such. The only liberty I took was to replace sorted
with natsorted
. If you don't want the additional python dependency you can merge this PR and then replace the two instances of natsorted
with sorted
, and other than numbers sorted ASCII-betical, the rest will work just fine.
There is only one outstanding problem with the export, that is with items renamed as duplicate <rest of the title>
, but there's no easy way to fix this because it changes based on the user's libraries and platforms, so I'd rather not address that in the exporter, where it doesn't belong. Especially considering that some games listed as duplicates are the only entry to begin with.
Also, I didn't quite find a way to "detect" DLCs easily. TEST DEVELOPER(?: 2)?
and TEST PUBLISHER(?: 2)?
are used for actual CDPR games/DLCs, so I can't exclude them. Some of the DLC entries are also duplicate, therefore I couldn't filter them out by analysing games DLCs. I ended up doing both to the best of my abilities, excluding known DLC ids from the list, plus a few titles that kept popping up (ref: Varstahl@gle.py#L265-L273).
If you guys find a better way, it should be even better.
As for the rest, this PR has now all the QoL improvements I wanted from it, but I'm still up to adding/removing things if you need them, just let me know.
~Introduced a bug in the export somewhere down the line, don't merge it yet.~ fixed, typo
Massive work. Thank you so much. I'm not even sure I understand the entire script now but I'm sure it works well.
I hope you don't mind but could you rewrite "usage" and "dependency" sections of the readme? Feel free to include a link to your project as well under a Related Projects header.
Sure, I'll open another PR later.
I hope you don't mind but could you rewrite "usage" and "dependency" sections of the readme?
Available now in #24
I'm not even sure I understand the entire script now
I tend to have that effect on developers, sadly. I tried to comment as much as I could, but if you feel lost in any section or need any clarifications ask away.
I have found that certain users wish to just execute the script without having to fiddle with the arguments, akin to v0.1. Does it make sense to modify default args to ensure backwards compatibility?
The default tab works perfectly fine, it's a totally valid CSV and makes most sense in the context of Galaxy exports. The usual --all
should work fine. Also there is no real default for CSVs, in several countries it defaults to ;
, in others to ,
, in others to \t
, I just used the default that will work for everyone.
Reverting back to ,
will cause more harm then good, I think.
This is my opinion, at least.
I'm actually not advocating for reverting back to the old delimeter. My intention was to ask whether --all
should be the default parameter for folks who don't want to be troubled and to ensure compatible "behaviour" with the earlier tag. Does this make sense? In case --all
is already the default arg, it isn't working properly and I'll try to get around to making a fix for it.
Ah, that. That's… actually a good point. Yeah, we could make it export everything by default. I'll work on it a minute.
Aye many thanks. Take your time!
MasterList
aliases to avoid confusionIt's quite a bit to digest, and slightly overkill, but "in for a penny, in for a pound", am I right?
For the sake of keeping things DRY I created a couple mini wrappers that reduce the boilerplate needed for this thing to work, mainly the
ba()
,id()
andjsl()
functions, all tiny and documented. Also, to keep things independent without going crazy, I created a parametrisedprepare()
function, that setups the query and helper variables to be used when parsing results.For example, after the introduction of
prepare()
, when adding fields to the CSV we no longer have to track the results order manually, accessing themresults[int]
is not viable andresults[positions['name']]
avoids mistakes and debug headaches.There is also the option to provide CSV fields by parsing content already available from the query. ~While doing this I noticed that the
platformList
queries were completely unnecessary and redundant, so I stripped the code of them.~As for the
jsl()
I've added some parsing to be used withtitle
andsummary
, that cleans up a bit of\n
and<br/>
to makes the CSV more consistent and accessible.Let me know what you think.
Edit: noticed that the
platformList
was helpful in de-duping elements with diverging titles, so I reintroduced it.