kaitai-io / kaitai_struct_formats

Kaitai Struct: library of binary file formats (.ksy)
http://formats.kaitai.io
711 stars 204 forks source link

java_class: add tags since Java 9, add version checks #642

Closed KOLANICH closed 1 year ago

armijnhemel commented 1 year ago

Why references to openjdk instead of the official spec? https://docs.oracle.com/javase/specs/jvms/se19/html/jvms-4.html

KOLANICH commented 1 year ago

Thanks for the link, I'll fix it.

KOLANICH commented 1 year ago

Fixed

KOLANICH commented 1 year ago

Fixed

generalmimon commented 1 year ago

Fixed

@KOLANICH Sorry for being ambiguous with respect to formatting - I don't think it's necessary to format the "java_class" as inline code with monospace font in commit titles (as `java_class`: ... - please use plain java_class: ... instead). I'd like to see something similar to this:

icc_4: use correct repeat-expr (see section 10.15 of specification)
icc_4: fix type of curve type (u2 instead of u4, see section 10.5 of 
icc: add chromaticity_type which was referenced, but not defined in an 
Fix {warcraft_2_pud,dicom}.ksy to compile with KSC 0.10 
zisofs: add specification (#603) 
napp_header: rename to android_nanoapp_header, improve metadata
napp_header: shorten /doc-ref commit hash
napp_header: remove bitmask duplication in flag extraction
add the Android Nano app header
java_class: add reference for 43.0 as the minimum version
java_class: add sanity check for the version number as there is a clash 
Add log/mcap.ksy (#592) 
elf: sync with upstream binutils code (#606) 
dtb: use hyphen - to join words in /doc (not the UTF-8 en dash)
zip: update compression and extra_codes enums, sync with upstream APP… 
nt_mdt_pal: fix encoding - UTF-16LE, not UTF-16 
java_class: improve is_two_entries to prevent 'ConstantPoolEntry' obj… 
Merge pull request #593 from armijnhemel/dos_mz 
dos_mz: fix for kaitai struct compiler (the web ide doesn't complain 
dos_mz: correctly compute the body length
dos_mz: allow both MZ and ZM magic

I've been following this convention for a while in the KSF repo, and I suggest to do the same so that we have a consistent style in the commit history.

KOLANICH commented 1 year ago

I propose to have all the identifiers be marked with back ticks to make them rendered in markdown as identifiers. Those inline comments were created for things like that - the quotes from source code. An identifier 8s a quote from source code, so IMHO it should be marked as such.

KOLANICH commented 1 year ago

Also it'd be nice to use the same convention for the commits introducing formats:

uuid: Create the spec nt_mdt, nt_mdt_pal: Create the specs for NT-MDT formats

generalmimon commented 1 year ago

@KOLANICH:

I propose to have all the identifiers be marked with back ticks to make them rendered in markdown as identifiers.

Sorry, I will not negotiate on this specifically. This convention is used in this repo for a long time, it works well, I'm happy with it and I see absolutely no reason to change it.

Those inline comments were created for things like that - the quotes from source code. An identifier 8s a quote from source code, so IMHO it should be marked as such.

Feel free to (and I'd even encourage to) use backticks for YAML path keys (e.g. /doc-ref, encoding, /meta/xref, etc.), .ksy type/enum/field identifiers (e.g. header type, len_data field, machine enum, sh_type::nobits enum value), expressions in KS expression language (e.g. header.data_size == 0xffff_ffff), and so on.

Just the commit title has to start with the .ksy base filename without the .ksy extension, then a colon (:) and space , if the commit changes a single .ksy spec.

If you would like a more precise classification, this Python code may help:

import unittest
from pathlib import PurePath

def check_ksf_commit_title(title: str, changed_file_paths: list[str]) -> None:
    if len(changed_file_paths) != 1:
        return
    file_path = PurePath(changed_file_paths[0])
    if file_path.name.endswith('.ksy'):
        expected_prefix = file_path.stem + ": "
        assert title.startswith(expected_prefix), \
            f"expected title \"{title}\" to start with \"{expected_prefix}\", but it did not"

class TestCheckKsfCommitTitle(unittest.TestCase):
    def test_invalid_1(self) -> None:
        with self.assertRaisesRegex(AssertionError, "expected title.* to start with \"java_class: \""):
            check_ksf_commit_title("`java_class`: remove unneeded quotes", ['executable/java_class.ksy'])

    def test_invalid_2(self) -> None:
        with self.assertRaisesRegex(AssertionError, "expected title.* to start with \"java_class: \""):
            check_ksf_commit_title("Add more tags into `java_class`", ['executable/java_class.ksy'])

    def test_valid(self) -> None:
        check_ksf_commit_title("java_class: remove unneeded quotes", ['executable/java_class.ksy'])

if __name__ == '__main__':
    unittest.main()
generalmimon commented 1 year ago

@KOLANICH:

Also it'd be nice to use the same convention for the commits introducing formats:

uuid: Create the spec nt_mdt, nt_mdt_pal: Create the specs for NT-MDT formats

No, I specifically don't think this would be nice. Commits that add a .ksy spec have often the same title as the pull requests and I don't see a single PR that would spontaneously use it. Instead, you often want to include the format title in the commit headline when you're describing a new format (like for example Add android's DTB/DTBO partitions format).

It's also clear from your examples that after you start the commit message with an identifier of the .ksy spec you're adding, you don't have anything meaningful to write. So you had to add some filling ("create the spec"), but this filling could have many variants (all equally meaningless), for example "add", "add the .ksy", "create", "describe the format", ... (which is a great room for inconsistency). And also commits adding vs. modifying a spec are then not so easy to visually identify in the commit history.

KOLANICH commented 1 year ago

Instead, you often want to include the format title in the commit headline when you're describing a new format (like for example Add android's DTB/DTBO partitions format).

Yeah. In fact if we want to search for commits introducing or modifying a certain id, we can use git for that without putting additional data into commits.

Just the commit title has to start with the .ksy base filename without the .ksy extension

.ksy base filename without the .ksy extension is the contents of meta.id. It generates the python module with the same name, that is an identifier of the module to be imported.

It's also clear from your examples that after you start the commit message with an identifier of the .ksy spec you're adding, you don't have anything meaningful to write.

That's true. A first commit touching a certain spec is highly likely a commit introducing this spec. The commit message can be completely empty TBH, but it'd just look not very good within the changelog.

And also commits adding vs. modifying a spec are then not so easy to visually identify in the commit history.

They start with the words describing adding. Though you are right, commits adding a feature also start with these words.

This convention is used in this repo for a long time

Approximately since c1627d239cd493ed775ed352a29255e4da4c2ef1 it started being actively used.

Sorry, I will not negotiate on this specifically. it works well, I'm happy with it and I see absolutely no reason to change it.

OK.

KOLANICH commented 1 year ago

Thanks for the review.