snake-biscuits / bsp_tool

Python library for analysing .bsp files
GNU General Public License v3.0
102 stars 7 forks source link

Unified naming convention for lump names & LumpClass members #73

Open snake-biscuits opened 1 year ago

snake-biscuits commented 1 year ago

This will involve some light refactoring

A bunch of lump names differ across different branch scripts Many of these lumps share functionality, and even structs in some cases Unifying these different lumps to singular names with clearly conveyed use & context would really help documentation

This could also hugely help research & tracing the iteration of various small engine concept (e.g. .bsp geometry rendering)

I've already started doing a little of this and am already seeing some Quake 3 -> MoH:AA -> CoD -> Titanfall connections

Refer to the new "Etymology" page on the wiki for more info on the rationale behind this

TL;DR: make more code the same so we write less code and don't get confused as frequently

snake-biscuits commented 1 year ago

Common naming conventions should also mean branch_script methods will be more universal & reusable

snake-biscuits commented 1 year ago

Refer to the new "Etymology" page on the wiki for more info on the rationale behind this

The wiki page also lays out a style guide for lump class members Should probably copy all the LumpClass & branch_script style documentation to a formal style guide It would genuinely help reduce the volume of code, better connect engine branches & reduce general cognitive complexity

snake-biscuits commented 1 year ago

Making sure each LumpClass extends the LumpClass it builds on is probably a good idea For small changes anyway, switching from base.MappedArray -> base.Struct seems like a bad idea imo But being able to trace backwards and also re-use methods would be great

Might need to look at the arguments for simplifying LumpClass definitions in #74 too We could reduce a lot of code use, at the cost of more cognitive complexity to understanding some LumpClasses

# borderline illegible
class LumpClass(parent.LumpClass):
   """Extremely hard to parse; comparing as_cpp would be much easier"""
    origin: vector.vec3  # position of this LumpClass
    _format = _format.replace("h", "i").replace("H", "I")
    _format = "3f" + _format
    __slots__.insert(0, "origin")
    _arrays.update({"origin": [*"xyz"]})
    _classes.update({"origin": vector.vec3})

This is pretty core to why we need to unify in the first place, keeping member names consistent To maintain the current level of accessible documentation we need to either: 1) Generate a C/C++ header for each branch_script (no includes besides standard library!) 2) Test all LumpClass definitions are consistent with their ancestors (might have to build tools, rather than automate this) 3) Manually enforce a styleguide to documentation & LumpClass member names (do a full pass of all ~42 branch_scripts)

If going for 2, we will have to do 3 anyway to figure out what the tool needs to do, and find edge cases 3 will require some manual corrections, unless we write a master script & add .as_cpp() methods to SpecialLumpClasses SpecialLumpClass.as_cpp() would either have to be abstract comments or a read function, e.g.:

void readEntities(char *src, std::vector *dest) {
    while (src) {  // until end of raw lump bytes
        dest.push(parseEntity());  // "{\n\"key\" \"value\"\n}" -> std::vector
    }
}

Tbh, I think going for 1 would be a great bonus, but 3 is going to have to happen A lot has been learnt about how each engine branch works has been learnt from switching between scripts & manually updating Taking what we've learnt from one format and applying that knowledge to another script doesn't seem easy to automate Doing so manually would ensure we don't miss anything and as is stated is 2, would be nessecary to do 3 anyway

TL;DR, Inheritance is a useful tool, but it has to be used correctly. We can gain a lot, at the cost of turning everything into a tower of cards

snake-biscuits commented 1 year ago

Unification is more of a Quality of Life thing anyway, it makes using bsp_tool cognitively simpler, which it should be. But this only applies to cases where the user is working with a slightly unfamiliar branch When the similarities are clearer, the easier it will be for users to recognise the differences

Going forward, build documentation around docs/supported, since that's most user's entry point into understanding .bsp files

Copy the lump changes & relationship docs from each branch script! Maybe even give them consistent names & formats, so users can peek them from a REPL

# in developer.branch
__parent__: str = "developer.parent_branch"  # module.__branch__ + module.__name__ (at bsp_tool.branches level)
# __changes__: Dict[str, Dict[str, str] | Set[str]]
__changes__ = {"renamed": {"FROM": "TO"}, "new": {"NEW_LUMP"}, "deprecated": {"OLD_LUMP"}}

__relations__: Dict[str, Set[str]] = {"LUMP_NAME": {"INDEXED_LUMP"}}
# TODO: function to generate nice looking ascii relationship diagrams for each branch_script
# -- or a mermaid script to generate an ERD svg diagram
# TODO: generate from / extend with LumpClass member names & annotation comments (e.g. Face.plane -> Plane)

0.5.0 idea

branch_script.__relations__ + unified member names could allow for easier indexing, e.g.:

bsp.LUMP[i].other_lump.another_lump.data

This would probably require some BspLump trickery where we link LumpClasses at runtime, e.g.

bsp.get("model[0].meshes[0].material_sort.texture_data.name")
# behind the scenes:
model = bsp.MODELS[0]
meshes = bsp.MESHES[model.first_mesh:model.first_mesh + model.num_meshes]
material_sort = bsp.MATERIAL_SORT[meshes[0].material_sort]
texture_data = bsp.TEXTURE_DATA[material_sort.texture_data]
return bsp.TEXTURE_DATA_STRING_DATA[texture_data.name]

Actually, if we use a different naming convention, we have another option:

bsp.model[0].meshes[0].material_sort.texture_data.name  # do the same as get method above

This approach should probably wrap a BspClass.get(address: str, start: Any = None) -> Any method start would provide an avenue to use a search / filter to collect objects first Would allow for quickly finding the names of all textures with a certain set of Surface flags set, for example.

TODO: maybe break 0.5.0 idea into it's own Issue, 0.4.0 is huge enough as is

snake-biscuits commented 1 year ago

VIS Cell & Portal Area & AreaPortal visclusters & Leaves

How do they work?

snake-biscuits commented 1 year ago

Like this?: https://youtube.com/watch?v=HQYsFshbkYw

A stencil buffer & winding tests could be used to draw visible spaces closest first by using depth & occlusion tests on portals before rendering the contents of each leaf

snake-biscuits commented 1 year ago

Just noticed shared.Ints BasicLumpClasses are all plural, while LumpClasses are singular Though it is worth noting that shared.Ints doesn't have the same features as base.Struct shared.UnsignedShort.from_stream(...) could be handy in a few places

Tbh there might be a better way of doing BasicLumpClasses, as they have become exclusively integer types quake.Edge was attempted at one point, but didn't work out.

SpecialLumpClasses are a mix of singular & plural (makes sense, some are lists, others are a single & complex object)

snake-biscuits commented 1 year ago

Trying to apply this principle to extensions.archives Ideally each archive interface should behave like the built-in zipfile.ZipFile class

Though this does mean we can't use the .from_file @classmethod alternate __init__

snake-biscuits commented 1 year ago

Might be worth refactoring BspClasses to use .from_file & .from_stream @classmethod __init__s Would make #21 much easier, as well as making the process of creating a new (empty) .bsp clearer

Should also help with catching extraneous related files (see #104), as we can't use os.listdir inside an archive I would like towards feeding bsp_lump.load_bsp a filepath like: .../Dreamcast/disc_images/Quake3.iso/maps/tdm10.zip/maps/tdm1.bsp

Being able to grab a .bsp from nested archives would be much nicer than having to extract everything beforehand

snake-biscuits commented 7 months ago

We should move branch methods from x_of_y -> y_x e.g. lightmap_of_face -> face_lightmap

snake-biscuits commented 7 months ago

How should branch splits be named? nexon.vindicuts / nexon.vindictus69 could be vindictus_steam / vindictus cso2 / cso2_2018 communicates the era, but is there a clearer name in the history?

apex_legends / apex_legends13 split in season13 But we just found that CSMAABBNode changed on rBSP v49 -> v50 during season10 (this gets extra messy w/ season11 depots including v49.1)

apex_legends / apex_legends50 / apex_legends51 makes sense now And this naming convention doesn't work for the nexon splits, because they didn't increment map version

I wish devs would just used lump versions Do they think they have to support all older lump version? is that why?

snake-biscuits commented 5 months ago

TODO: Face.light_offset -> Face.lighting_offset in Source Engine Face LumpClasses

snake-biscuits commented 4 months ago

https://github.com/snake-biscuits/bsp_tool/commit/ce8b18d47de6ac538556d747538f5a398502a02a brings respawn.titanfall & respawn.titanfall2 PrimitiveType together This struct appears in both GeoSet & Primitive as .child.type & .type Commit message doesn't acknowledge this because I accidentally shipped it w/ a base.BitField fix