Closed frugs closed 7 years ago
As of 6/7/17 this is still an issue. Is this an issue with the ability_lookup.csv being incorrect, the
I have never really understood this part of sc2reader. @GraylinKim do you have any remaining brain dumps about this that you could share with us?
It has been a while since I last looked at this I recall the primary fault is the shell command detailed in HOWTO
:
find Balance\ Data -print0 | xargs -0 grep -h '<ability' | sort | uniq | tr '="' ' ' | awk -v OFS=',' '{print $5, $3}' | sort -n
This generates a csv which only contains a mapping of ability links to ability ids, but ignores command indices.
Ideally, it should be a mapping of ability link and a command index to an ability id, and this information should be preserved when transformed in subsequent steps.
As an example, here is an extract from SCV.xml
in the balance data for protocol 48258:
<unit id="CommandCenter" ability="128" index="0">
...
</unit>
<unit id="SupplyDepot" ability="128" index="1">
...
</unit>
<unit id="Refinery" ability="128" index="2">
...
<unit id="Barracks" ability="128" index="3">
...
</unit>
And the matching generated lines in 48258_abilities.csv:
128,Barracks
...
128,CommandCenter
...
128,Refinery
...
128,SupplyDepot
Notice that the command indices have been dropped, so it is no longer possible to perform a correct lookup.
I see that. I am currently looking at the xml files that you get from following the HOWTO instructions. Each unit xml file has abilities within it. Further, each ability has a "command index" that tells what command within that ability is triggered.
Judging from the current build, it appears that it is only taking into account the ability ID, not the command ID (as you said). That said, it seems that the information is available and possible to retrieve. I am just not exactly sure that could fit into the current reader.
I believe that, eventually, there is some facility for combining an ability link and a command index into one value using some bit shifting (here is an extract from sc2reader/data/__init__.py
, starting from line 372):
int_id_base = int(int_id_base, 10) << 5
abils = ABIL_LOOKUP[str_id]
real_abils = [(i, abil) for i, abil in enumerate(abils) if abil.strip() != '']
...
for index, ability_name in real_abils:
...
build.add_ability(
ability_id=int_id_base | index,
...
If we could propagate the command index information to this point, I expect that this will fix this issue.
Hmm, I see that. That seems relatively convoluted. Maybe the format has changed, but even those "abilities" seem off that you listed above (given from the output of the shell command). It would appear to me that those indices are really pointing to an ability that all of those units share, such as BuildInProgress (which is index 127 in the most recent build for the buildings you listed above). Even fixing this error in labeling might help.
After digging in the files output from the game, I think I see a reason for the confusion and potentially the error. In the xml files, there are two possible formats for "abilities", at least according to what is in the csv files.
The first is within the abilites tags that many units have:
<ability id="BuildInProgress" index="127">
<command id="Cancel" index="0">
<meta name="71" icon="btn-command-cancel" hotkey="72" type="command" tooltip="73"/>
</command>
</ability>
<ability id="TimeWarp" index="116">
<command id="Execute" index="0">
<meta name="631" icon="btn-ability-protoss-protoncharge-color" hotkey="632" type="command" tooltip="633"/>
<cost cooldown="5"/>
</command>
</ability>
<ability id="que5Passive" index="277">
<command id="CancelLast" index="0">
<meta name="202" icon="btn-command-cancel" hotkey="91" type="command" tooltip="203"/>
</command>
</ability>
<ability id="RallyNexus" index="105">
<command id="Rally1" index="0">
<meta name="93" icon="btn-ability-terran-setrallypoint" hotkey="94" type="command" tooltip="95"/>
</command>
</ability>
Notice here how each ability has a different index, and within each ability there is a command. Each command has an index unique to its ability.
Then, we have the second, which is in the format you provided above. This applies to building/creating units, it seems. In this, each unit has an ability attribute that is not unique, but their indices are unique within that ability number. The index ability_lookup seems to be where it factors in the indices, but it only has one file (not one per build), so I am not sure how this can be correct. My guess is that the first column of each row is representative of the "id" field for each ability, while the subsequent columns represent the names that they get depending on their index.
Ah, I think I may have something further to add!
I believe there is only one ability_lookup.csv
because of an underlying assumption that abilities won't have their string ids changed, and that the command indices of commands on particular abilities will remain constant. This means that if you join the information contained in [build]_abilities.csv
and ability_lookup.csv
on the string ability id, you should be able to work out the command index and ability index for every command.
An example would be the following:
[build]_abilities.csv
128,builds
38,move
ability_lookup.csv
builds,CommandCenter,SupplyDepot,Refinery,Barracks...
move,Move,Patrol,HoldPos
This gives us the information we need to call
build.add_ability(
ability_id=int_id_base | index,
name=ability_name, # probably best rename this to command_id
is_build=bool(unit_name),
build_unit=getattr(build, unit_name, None),
build_time=build_time
)
in __init__.py
As an example, for the Patrol command we would want to do the following:
build.add_ability(
ability_id=(38 << 5) | 1,
name='Patrol',
is_build=false,
build_unit=None,
build_time=0)
The complications would be parsing the xml to allow us to interpret the <ability>
tags as well as the <trains>
and <builds>
tags (I am not sure if there are other alternatives, but I don't believe so).
So that makes sense to me. It seems that example only works for regular abilities. For the "train" abilities, which is really just training units, they have the same ability lookup depending on what builds them. So I am not quite sure how it decides which instance of 128 (using the example a couple comments up) it chooses.
Having a quick look at CommandCenter.xml
, Barracks.xml
and Starport.xml
, it looks like the <unit>
tags inside the <trains>
tags have different ability ids. If we treat the <trains>
tag as an ability with the ability name CommandCenter_trains
, Barracks_trains
and Starport_trains
respectively, there should be no problem.
So, an example again:
[build]_abilities.csv
154,CommandCenter_trains
158,Barracks_trains
160,Starport_trains
ability_lookup.xml
CommandCenter_trains,SCV
Barracks_trains,Marine,Reaper,Ghost,Marauder
Starport_trains,Medivac,Banshee,Raven,Battlecruiser,VikingFighter,Liberator
This will allow us to do something along the lines of:
for ability_id, ability_name in build_abilities:
commands = ability_lookup[ability_name]
for command_index, command_name in enumerate(commands):
build.add_ability(ability_id << 5 | command_index, command_name, ...)
There is some confusing nomenclature here with respect to ids, names and indices that could probably be cleared up here too.
It seems that would certainly be more clear than what is happening. Unfortunately, for whatever reason, this is not always the case. For example, the Nexus can train a probe (ability 175) and a mothership core (ability 313). I put together a python snippet that should compile the list (and unit list) and is more straighforward than the shell command. It looks for abilities, trains, builds, and researches tags. I didn't see any more that have "ability" attributes. Other than the last three.
I just considered backwards compatibility; this could require a bit more thought.
Yeah I wonder if the same format is kept, and the new csv files are added/updated, if it will fix the issue.
My guess is that the index is "preserved" by the label's position on the ability_lookup csv, but I am not sure.
Yeah I wonder if the same format is kept, and the new csv files are added/updated, if it will fix the issue.
I believe so, but it will require ensuring that the ability name in the older [build]_abilities.csv
files have their commands preserved in ability_lookup.csv
in order to ensure backwards compatibility with older replays.
My guess is that the index is "preserved" by the label's position on the ability_lookup csv, but I am not sure. I believe that this was the original intention based on the following line in
__init__.py
:real_abils = [(i, abil) for i, abil in enumerate(abils) if abil.strip() != '']
Here, the index of the column (ignoring the first column, as the first column is used as a key) that the command features in
ability_lookup.py
is being used as the command index.
Having had a quick look at WOL/18092_abilities.csv
, I notice that the ability names for the train
commands appear to be CommandCenterTrain
, BarracksTrain
, LarvaTrain
etc. Following this format should ensure backwards compatibility.
It seems that would certainly be more clear than what is happening. Unfortunately, for whatever reason, this is not always the case. For example, the Nexus can train a probe (ability 175) and a mothership core (ability 313). I put together a python snippet that should compile the list (and unit list) and is more straighforward than the shell command. It looks for abilities, trains, builds, and researches tags. I didn't see any more that have "ability" attributes. Other than the last three.
Looking this time at HotS/23925_abilities.csv
, it looks like there are three unique train abilities that do not fit the usual pattern: NexusTrainMothership
, NexusTrainMothershipCore
, and TrainQueen
. We may need to write some custom handling for these ability names in the new csv generation script.
It also looks like the Hatchery, Lair and Hive research abilities are keyed under the LairResearch
ability name, so this will probably also require some custom handling.
Yeah I looked back at it, and some of the "train" ones aren't working, even after I created a new CSV file for the newest build. I think the "train" and "research" are good options. And it preserves backwards compatibility because these changes should only take effect if the build is new.
I find it hard to believe that there will be no instances of name changes or index changes. There should really be a different ability_lookup for each new build or it should be an automated process. Like you said, the command indices are not automatically preserved, so right now you would have to go back and do it yourself for every possible command index, which makes it highly likely that will make a mistake.
There should really be a different ability_lookup for each new build
I completely agree with this, but I believe this should be an improvement made after resolving this issue.
the command indices are not automatically preserved
I'm not sure what you mean by this. Are you referring to the current csv generation pipeline? My suggestion is to fix the csv generation so that the command indices are preserved, in the form that the code in __init__.py
appears to expect them to be preserved (namely, in ability_lookup.csv
).
Yes. What I mean is that if you just use "CommandCenter" instead of "TerranBuild", it has an index that is not preserved, you have to manually do it yourself in the ability_lookup file (I'm agreeing with you that the names should be changed back to what they were before). Further, there should be a programmatic way to do it so it is never done incorrectly, because I think that is what happened in recent builds.
OK, then I think we are largely agreed on how we think the resolution of this issue should be progressed. I can start working on a PR for a script which will will generate correct [build]_abilities.csv
from the balance data xml (i.e. a replacement for the now-defunct shell command described in HOWTO
).
The next step would be to add a script which will be able to add any new commands found in new balance data xml files with the correct command indices to ability_lookup.csv
.
Finally, updating the HOWTO
with instructions on how to use these new scripts should resolve this issue.
It would then probably be wise to consider further improvements to the system overall (e.g. tie command indicies to balance versions, tie the data contained in train_commands.json
to balance versions, combine [build]_abilities.csv
and ability_lookup.csv
into one table etc.)
Given that I know no way of exporting older versions of the balance data, I will not aim for backwards compatibility for the csv generation scripts with them.
I will aim for the following situation:
generate_abilities.py
must always be able to generate a valid [build]_abilities.csv
from the latest version of the balance data. If the format of the balance data is changed, the script must also be updated.
generate_new_commands.py
must always be able to generate a valid ability_lookup.csv
from the latest version of the balance data and a previous ability_lookup.csv
.
Previously generated (correct) versions of [build]_abilities.csv
must still be compatible with any newly generated ability_lookup.csv
, allowing for sc2reader to correctly identify abilities used in replays using those balance versions.
@dsjoerg @ccopland94 First PR: https://github.com/ggtracker/sc2reader/pull/22
Outstanding @frugs!! Now that you've resolved this, does this help you with your own sc2reader-based project, or did you dig into this issue due to some problem related to the ggtracker website?
^^ Same question for @ccopland94
@dsjoerg I was running into issues with 6 months ago when I was working on my own sc2reader project intending to plot a player's "macro commands" in a one dimensional scatter plot against game time. The idea was to help help players compare where their macro starts faltering compared to a high-level player, and allow them to focus on concentrating on ensuring they perform "macro actions" more consistently.
Once this is fixed, hopefully I can reboot this project.
Yeah my usage was with a non-ggtracker project.
I reviewed the changes and I think the new file will serve it well. I think this solves the issue short term, but it would be great if it could 1.) put the csv files in the proper directories and 2.) either update the currently existing unit and ability lookups or use versioning for those too.
@frugs @ccopland94 excellent thanks!
@ccopland94 putting the csv files in the proper directories is a little tricky as I'd have to start assuming the directory from which the script is executed, which could be a fragile assumption.
I could perhaps make the sc2reader project root path a required input parameter to the script; would that be better?
That is true. I had assumed it would just be run from where you had it placed in the project directory.
@ccopland94 I have incorporated your suggestion 1) into https://github.com/ggtracker/sc2reader/pull/24
This issue can now be closed.
The the current
new_units.py
and the shell commands listed in theHOWTO
do not preserve the command index of the ability when populatingability_lookup.csv
. This means that the abilities with the same ability link but different command indices end up mapping to the same ability id.This can be easily verified by loading a replay (with game events) where the player builds a supply depot and a barracks. The supply depot command event will have the correct ability name, but the barracks command event will not.