NCrusher74 / SwiftTaggerID3

A Swift library for reading and editing ID3 tags
Apache License 2.0
5 stars 2 forks source link

Credits list removal functions #13

Closed NCrusher74 closed 4 years ago

NCrusher74 commented 4 years ago

This is the frame that we're handling as a [role: [person]] dictionary, and I'd like to add functionality for these items. I might be able to figure out the first and possibly the second on my own, but the third and maybe fourth will be tricky.

Update: Two down. The fourth was easier than I thought it would be.

NCrusher74 commented 4 years ago

Looking at that first item on the checklist, I keep waffling about whether or not it's even necessary. I think I'm looking at it from the perspective of what I'm seeing trying to make the unit tests pass, rather than how the user is going to use it.

The user has the ability to get and set the full [role : [ person ]]dictionary. Presumably, that means they're going to be printing it out in an interface somewhere, right? And in that interface, the user would just manually delete whatever item in the array they no longer wish to be there, and when they save, that new value will overwrite the existing value. So I probably don't need to worry about that?

(For that matter, this probably applies to the fourth item, as well.)

SDGGiesbrecht commented 4 years ago

Yes. Also, since the dictionary is available as a whole, the client can simply string more off the end of the line of code using any of the dictionary APIs to manipulate it to their heart’s content. For example:

wherever.theDictionary[someRole]?.removeAll(where: { $0 == "Someone" })
NCrusher74 commented 4 years ago

True. I think I'm just too close to the interior workings and it's making me overthink things. Which may be the case with this remaining issue as well.

I'm trying to decide the best way to handle the musician credits list frame for version 2.2 and 2.3.

I remember when working with ID3TagEditor I would get a fatal error when I tried to use a frame that wasn't available for whatever version I tried to write to. Happily, that isn't happening with SwiftTaggerID3 (at least not for the credits list frame) but since the process of adding frames to a tag is blind to whatever ID3 version we ultimately end up trying to write, that means all the frames are "available" but (again, at least in the case of the musician credits list frame) what ends up happening is...nothing. We don't get an error, but the frame doesn't write if the output version is incompatible with the frame.

I'm...honestly not sure how I should handle this. If it were just version 2.2 that doesn't support the MusicianCreditsList frame, I probably wouldn't bother. I mean, I'd like to minimize version incompatibility as much as possible, but I'm not sure if support for version 2.2 is in-demand enough to make it worth going to significant effort. I don't know about other tagging apps, but though Kid3 and Yate will read version 2.2, they won't write to it. With those apps, if you read in a version 2.2 tag and make changes, it will be replaced by version 2.3 or 2.4.

However, the MusicianCreditsList frame was newly added with version 2.4, and I know that version 2.3 is still popular enough that I should probably do...something. Either make it more obvious that if someone tries to write a MusicianCreditsList frame to version 2.3, nothing will happen, or find a way to convert whatever they put in the MusicianCreditsList frame to entries in the InvolvedPeopleList frame instead.

I have added this to the function in FrameProtocol where the id3Identifier gets encoded, and it's a start:

        var adjustedLayout = layout
        // if there's a musician credits list frame for an incompatible version, convert it to an involved people list frame
        if layout == .known(.musicianCreditsList) && (version == .v2_2 || version == .v2_3) {
            adjustedLayout = .known(.involvedPeopleList)
        } else {
            adjustedLayout = layout
        }

However, I suspect that if this happens and there's already an involvedPeopleList frame in the tag, one or the other will not be written. Also, because a lot of the items in the MusicianAndPerformerCredits enum aren't in the InvolvedPeopleCredits enum, the role becomes .none, and if there's more than one entry that gets converted this way, only one ends up being written (probably because it's happening AFTER our logic to append values to [person] if a role already exists.) I could probably solve that by merging the two role enums and using the merged enum for both frames, if necessary.

Or...I could be overthinking again. Maybe I shouldn't worry about it because it can be handled by whoever ends up using it? Or maybe I should just throw an error and call it a day. I don't know.

SDGGiesbrecht commented 4 years ago

It’s up to you. You could simply throw an error when an unsupported tag is specified for a particular version, and let the client decide what to do. Or you could invent some private use frame to divert the information into.

NCrusher74 commented 4 years ago

I'm trying to do something like this, but I'm still iffy on how static variables work.

// this would probably go in FrameProtocol or somewhere else external to frame instance?
    private static var creditsList: [String : [String] ] {
        get {
            return CreditsListFrame.credits
        }
        set {
            for key in newValue.keys {
                let value = newValue[key]
                CreditsListFrame.credits[key] = value
            }
        }
    }

// this (which is incomplete at the moment) would be the CreditsListFrame encode function
// if the layout/version combo is appropriate, it would store the credits data in the static variable
// or retrieve it from the static variable and add it to the credits dictionary
// then encode the dictionary
    func encodeContents(version: Version) throws -> Data {
        var frameData = Data()
        // append encoding Byte
        frameData.append(StringEncoding.preferred.rawValue)

        if self.layout == .known(.musicianCreditsList) && (version == .v2_2 || version == .v2_3) {
            for key in credits.keys {
                let value = credits[key]
                CreditsListFrame.creditsList[key] = value
            }
        }

        if self.layout == .known(.involvedPeopleList) && (version == .v2_2 || version == .v2_3) {
            for key in CreditsListFrame.creditsList.keys {
                let value = creditsList[key]
                credits[key] = value
            }
        }
        // encode and append each credit
        for key in credits.keys {
            frameData.append(key.encoded(withNullTermination: true))
            let valueString = credits[key]?.joined(separator: ",") ?? ""
            frameData.append(valueString.encoded(withNullTermination: true))
        }
        return frameData
    }
SDGGiesbrecht commented 4 years ago

static means there is only one variable shared by all instances of the type. (On a protocol, each conforming type would still have its own; it wouldn’t be shared across all conforming types.)

NCrusher74 commented 4 years ago

Okay. So it would be best to create the static var in the CreditsListFrame? That way, I can store to it when the CreditsListFrame instance is a MusicianCreditsList frame for versions 2.2/2.3, and read from it when the CreditsListFrame is InvolvedPeopleList.

But how do I ensure that the MusicianCreditsList frame gets processed before the InvolvedPeopleList frame?

NCrusher74 commented 4 years ago

I haven't tested it yet, but here's what I have so far?

    static var creditsList: [String:[String]] = [:]

    private func storeMusicianCreditsInEarlyVersions() {
        for key in self.credits.keys {
            let value = self.credits[key]
            CreditsListFrame.creditsList[key] = value
        }
    }

    private func convertMusicianCreditsToInvolvedPerson() -> [String: [String]] {
        var credits = self.credits
        for key in CreditsListFrame.creditsList.keys {
            let value = CreditsListFrame.creditsList[key]
            credits[key] = value
        }
        return credits
    }

    func encodeContents(version: Version) throws -> Data {
        var frameData = Data()
        // append encoding Byte
        frameData.append(StringEncoding.preferred.rawValue)

        if version == .v2_2 || version == .v2_3 {
            if self.layout == .known(.musicianCreditsList) {
                storeMusicianCreditsInEarlyVersions()
            }
            if self.layout == .known(.involvedPeopleList) {
                let dictionary = convertMusicianCreditsToInvolvedPerson()
                for key in dictionary.keys {
                    frameData.append(key.encoded(withNullTermination: true))
                    let valueString = credits[key]?.joined(separator: ",") ?? ""
                    frameData.append(valueString.encoded(withNullTermination: true))
                }
            }
        } else {
            // encode and append each credit
            for key in credits.keys {
                frameData.append(key.encoded(withNullTermination: true))
                let valueString = credits[key]?.joined(separator: ",") ?? ""
                frameData.append(valueString.encoded(withNullTermination: true))
            }
        }
        return frameData
    }
SDGGiesbrecht commented 4 years ago

I’m not sure I understand your strategy. If it works out, great.

If it doesn’t pan out, what I would do is simply register some private‐use frame identifier (Some identifiers are reserved for that right?) as the identifier for such a frame on earlier versions where no official identifier exists.

NCrusher74 commented 4 years ago

In the earlier ID3 versions, the InvolvedPeopleList frame performed the function that both the InvolvedPeopleList and MusicianCreditsList frames do in version 2.4, so I guess I was trying to return to that concept by converting the contents of a hypothetical MusicianCreditsList frame in version 2.2/2.3 and add them to the contents of the InvolvedPeopleList frame.

But the way I have it now, it would only work if the user also created an InvolvedPeopleList frame and I could ensure that the attempt to encode the MusicianCreditsList frame happened before the attempt to encode the InvolvedPeopleList frame. So, scratch that idea, I guess.

Since the TMCL identifier isn't in use for version 2.3, I suppose there's no reason why I couldn't use it anyway, and just make it clear in the documentation that not all other apps will recognize it. Then I can just make up an identifier for version 2.2.

NCrusher74 commented 4 years ago

Well. With that done...and I'm nervous even typing this...I think I may be finished. SwiftTaggerID3 has all the functionality I need, and then some. All the tests are passing, the one remaining issue has been flagged in the ReadMe because honestly I'm not sure it's worth the trouble of correcting.

I can't think of any more testing to do. Can you?

SDGGiesbrecht commented 4 years ago

I can't think of any more testing to do. Can you?

🤷‍♂️ You’ll find out if/when you discover a bug while you’re using it.

If you want to get an idea of how much of your library is backed by tests, you can look up how to turn on code coverage in Xcode.

NCrusher74 commented 4 years ago

Actually, I only noticed a few days ago that some of the comments you left in workspace.swift were actually specifically about SwiftTaggerID3 and not just general comments, and I had meant to ask about them and then got distracted. I know it has an enforceCoverage setting for both testing and documentation, both of which sound like a good idea in theory but I wasn't sure if I should mess with it?

NCrusher74 commented 4 years ago

For the most part things look okay. There are a lot of implicit closures in the StringFrame that aren't being tests, and I'm not sure how to correct that, because all of the StringFrame types were being tested, except in the case of a few of the "alias" getter-setters, where the same frame is just called by a different name.

SDGGiesbrecht commented 4 years ago

sound like a good idea in theory but I wasn't sure if I should mess with it?

Go ahead and toy with it. You can always revert with Git if it does something you don’t like.

Both those checks would take place during validate if you were to enable them.

NCrusher74 commented 4 years ago

So it looks like most of the tests that are being flagged as missing involve the empty-string fallbacks and so forth.

And I'm really not sure how to test for those, because in almost all cases, they're a fallback in case the string is nil, but when writing I can't create anything BUT a string there, so it's not going to be nil.

Other than that, the last frame remaining to be tested is the Unknown frame, and I have to admit, it's puzzling me.

It's not great, because in order to read the frames, it's converting them to UserTextFrame, but here's what I see when I try to print the tag:

Optional([SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "v\u{05}") Optional("v\u{05}"): ), 
SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "engTOS") Optional("engTOS"): ), 
SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "Û") Optional("Û"): ), 
SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "YiTG") Optional("YiTG"): 1), 
SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "") Optional(""): ), 
SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "㤮") Optional("㤮"): ††††), 
SwiftTaggerID3.Frame.stringFrame(frameKey: .album: "Test"), SwiftTaggerID3.Frame.localizedFrame(frameKey: .userDefinedText(description: "ttp://commercial.com") Optional("ttp://commercial.com"): )])

It looks like maybe some of the characters being used in the ID3 Identifier for a couple of the frames I'm running tests on (Yate only writes a handful of them) aren't being read as ASCII (which they should be?) so I'm hitting an encode error when trying to re-encode the Identifier. Which I can't avoid because the entire way things are structured is that we peel off the identifier in the Tag initializer, handle the frame, then add the identifier back in, encoded ASCII, when writing the frame. The contents are fine because we're not re-encoding them, just passing them through, but the identifier gets re-encoded.

And...I don't know why the identifiers aren't being read as ASCII. Here's the rawData from Yate:

00000027 00000041 00000014 (0000T0) PCNT 
00000041 00000065 00000024 (0000T1) OWNE 
00000065 00000078 00000013 (0000T0) MCDI 
00000078 00000100 00000022 (0000T0) RVRB 
00000100 00000116 00000016 (0000T0) POPM 
00000116 00000133 00000017 (0000T0) USER 
00000133 00000164 00000031 (0000T0) WCOM 

Here's what I see when I print(identifier) in theTag` initializer after converting it to ASCII:

PCNT
OWNE
MCDI
RVRB
POPM
USER
WCOM

But then when I use print(frame) just a few lines down in the Tag initializer:

localizedFrame(frameKey: .userDefinedText(description: "YiTG")
Optional("YiTG"): 1)
localizedFrame(frameKey: .userDefinedText(description: "")
Optional(""): )
localizedFrame(frameKey: .userDefinedText(description: "㤮")
Optional("㤮"): ††††)
localizedFrame(frameKey: .userDefinedText(description: "v\u{05}")
Optional("v\u{05}"): )
localizedFrame(frameKey: .userDefinedText(description: "")
Optional(""): )
localizedFrame(frameKey: .userDefinedText(description: "Û")
Optional("Û"): )
localizedFrame(frameKey: .userDefinedText(description: "engTOS")
Optional("engTOS"): )
localizedFrame(frameKey: .userDefinedText(description: "ttp://commercial.com")
Optional("ttp://commercial.com"): )

So it would see to indicate that the problem is being introduced in Frame, but I can't find anywhere in Frame where it would default to handling an unknown frame as a UserDefinedTextFrame or introduce those sorts of characters. Nor is there anything in FrameLayoutIdentifier, which was what I checked next.

So, yeah. I'm...baffled.

SDGGiesbrecht commented 4 years ago

It’s probably offset and pointing at the wrong section of data because something peeled off too much or too little from the front first.

SDGGiesbrecht commented 4 years ago

And I'm really not sure how to test for those, because in almost all cases, they're a fallback in case the string is nil, but when writing I can't create anything BUT a string there, so it's not going to be nil.

Usually that means the code can be reorganized so that the impossible situation isn’t a code path.

If not, you may not care about testing that section. There is nothing you can do about Xcode’s in‐file highlighting. But with Workspace, you can stick // @exempt(from: tests) somewhere on the first line of the untested zone to explicitly mark it as a section you don’t care about (I usually include the reason why if it isn’t obvious). That allows you to more easily read future reports, instead of needing to scan through a zillion irrelevant bits.

NCrusher74 commented 4 years ago

Our declared tag size is 164 bytes. Yate includes 10 bytes of padding, also, and calls it 174 bytes.

Using print(identifierBytes.hexadecimal() right after while !remainder.isEmpty in Tag...

54 58 58 58 // TXXX
50 43 4e 54 // PCNT
4f 57 4e 45 // OWNE
4d 43 44 49 // MCDI
52 56 52 42 // RVRB
50 4f 50 4d // POPM
55 53 45 52 // USER
57 43 4f 4d // WCOM

So, it looks like the identifiers are being parsed correctly. And I assume the frame size is being read correctly because each of the identifiers is being parsed correctly afterward?

The size bytes from `FrameProtocol(decodingFromStartOf:):

0 0 0 7 // TXXX frame with 1 encoding, description: "YiTG(null)" and a numeric string of "1"
// Yate confirms frame size: 17 = 10 header bytes + 7 content bytes

0 0 0 4 // This is the PCNT (play counter) frame, which the spec describes as follows:
/*    This is simply a counter of the number of times a file has been
   played. The value is increased by one every time the file begins to
   play. There may only be one "PCNT" frame in each tag. When the
   counter reaches all one's, one byte is inserted in front of the
   counter thus making the counter eight bits bigger.  The counter must
   be at least 32-bits long to begin with.

     <Header for 'Play counter', ID: "PCNT">
     Counter        $xx xx xx xx (xx ...)
*/
// So four bytes is right for that. Yate confirms 14 total bytes for this frame.

0 0 0 e // Yate confirms 24 bytes for this frame, which is the "Ownership" frame. 
// Spec is messy for this one.
/*
     <Header for 'Ownership frame', ID: "OWNE">
     Text encoding     $xx
     Price paid        <text string> $00
     Date of purch.    <text string>
     Seller            <text string according to encoding>
*/
// So there would be 14 bytes after the frame header. 1 encoding byte. 
// Price paid was entered as 9.99, so that's five more bytes when you add the null terminator. 
// That leaves 8 bytes unaccounted for, because I hadn't entered a purchase date or seller.

0 0 0 3 // Yate confirms 13 bytes for this frame, which is Music CD Identifier.
/*
     <Header for 'Music CD identifier', ID: "MCDI">
     CD TOC                <binary data>
*/
0 0 0 c // Reverb frame. I don't remember what I actually entered in this frame, 
// but Yate converted it to "120000000000000000000000". 
// Which is 24 digits, this frame is supposedly 12 bytes?

0 0 0 6 // POPM frame. Which I have no idea how they calculate
// I wrote to it by giving the file 5 stars.
/*
     <Header for 'Popularimeter', ID: "POPM">
     Email to user   <text string> $00
     Rating          $xx
     Counter         $xx xx xx xx (xx ...)
*/

0 0 0 7 // TOS frame, in which I just entered "TOS". 
/*
     <Header for 'Terms of use frame', ID: "USER">
     Text encoding        $xx
     Language             $xx xx xx
     The actual text      <text string according to encoding>
*/
// So, Encoding byte, + a 3-byte language code, + "TOS" That tracks.

0 0 0 15 // Commercial information website frame, which is "http://commercial.com" for 21 bytes, right on target.

So, those are all reasonable sizes for the frames in question.

Flags are all [0,0] as they should be.

And here's the contentData after the header has been parsed out:

0 59 69 54 47 0 31 // 7 bytes
0 0 0 21 // 4 bytes
1 39 2e 39 39 0 20 20 20 20 20 20 20 20 // 14 bytes
98 76 5 // 3 bytes
12 0 0 0 0 0 0 0 0 0 0 0 // 12 bytes
0 db 0 0 0 0 // 6 bytes
0 65 6e 67 54 4f 53 // 7 bytes
68 74 74 70 3a 2f 2f 63 6f 6d 6d 65 72 63 69 61 6c 2e 63 6f 6d // 21 bytes

So, those check out too. All the bytes are accounted for.

When I print the data out at this point...

        let tagDataRange = remainder.startIndex ..< remainder.startIndex + tagSize
        remainder = remainder.subdata(in: tagDataRange)
         // snip
        while !remainder.isEmpty {
            print(remainder) // should be only the frames data at this point

This is what I get:

54 58 58 58  |  0 0 0 7  |  0 0  |  0  |  59 69 54 47 0  | 31  // 17 bytes
50 43 4e 54  |  0 0 0 4  |  0 0  | 0 0 0 21  // 14 bytes
4f 57 4e 45  |  0 0 0 e  | 0 0  |  1  | 39 2e 39 39 0 20 20 20 20 20 20 20 20  // 24 bytes 
// encoding is `utf16WithBOM` for that one?
4d 43 44 49  |  0 0 0 3  |  0 0  |  98 76 5  // 13 bytes
52 56 52 42  |  0 0 0 c  |  0 0  |  12 0 0 0 0 0 0 0 0 0 0 0 // 22 bytes
50 4f 50 4d  |  0 0 0 6  |  0 0  |  0 db 0 0 0 0 // 16 bytes
55 53 45 52  |  0 0 0 7  |  0 0  |  0  |  65 6e 67  |  54 4f 53  // 17 bytes
57 43 4f 4d  |  0 0 0 15  |  0 0  |  68 74 74 70 3a 2f 2f 63 6f 6d 6d 65 72 63 69 61 6c 2e 63 6f 6d // 31 bytes
0 0 0 0 0 0 0 0 0 0 // yate's 10 bytes of padding

So, that accounts for the 174 bytes of the tag. Everything is in place. I don't think the padding could be throwing things off, because it happens at the end.

I...I don't know.

SDGGiesbrecht commented 4 years ago

Well I guess the next two things to try are:

  1. Convert a muddled identifier to bytes, and then search for that byte pattern in the data to see if it allows you to quickly identify where it is coming from.
  2. Use print to narrow in on the first line where a muddled identifier appears.
NCrusher74 commented 4 years ago

Okay I think I see what is happening. The problem isn't in the decoding stage, it's in the encoding stage.

When we "peel" the id3Identifier off at the beginning in Tag, we don't actually store that data anywhere intact. We just use it to initialize a FrameLayoutIdentifier instance, and from that point on, we refer to the frame not by the id3Identifier, but by the layout.

// from Tag
        while !remainder.isEmpty {
            // extract the identifier data
            let identifierBytes = remainder.extractFirst(version.identifierLength)
            // if the first byte is 0x00, it's padding, not a frame
            guard identifierBytes.first != 0x00 else {
                break
            }
            // convert identifier data to string
            let identifier = try String(ascii: identifierBytes)
            // pass the frames data over to `Frame` to determine which frame type will handle it
            let frame = try Frame(identifier: identifier,
                                  data: &remainder,
                                  version: version)
            // get the frame key
            let frameKey = frame.frameKey
            // add frame to dictionary
            frames[frameKey] = frame
        }

// from Frame
    init(identifier: String,
         data: inout Data.SubSequence,
         version: Version) throws {
        // use the identifier string to determine frame layout
        let layout = FrameLayoutIdentifier(identifier: identifier)
        switch layout {
        //
            case .unknown(identifier):
                self = .unknownFrame(try UnknownFrame(
                    decodingFromStartOf: &data,
                    version: version,
                    layout: layout))

And we've sort of got the identifier stored (in that the layout is .unknown(identifier), but we don't really use it again.

Then we get to the encoding stage, and it's time to reassemble the frame. But in most cases we don't actually have the identifier string or data. We just have a layout. But in the case of the known frames, because we know the layout, we know the identifier, so we recreate the identifier from the layout:

    /// Calculates and extracts the ID3 indentifier from the layout
    /// - Parameters:
    ///   - layout: the FrameLayoutIdentifier
    ///   - version: The version of the ID3 tag
    /// - Returns: The encoded identifier string
    private func identifierData(
        layout: FrameLayoutIdentifier,
        version: Version) -> Data {

        guard let identifier = layout.id3Identifier(version: version)?.encodedASCII(withNullTermination: false) else {
            switch version {
                case .v2_2: return "TXX".encodedASCII(withNullTermination: false)
                case .v2_3, .v2_4: return "TXXX".encodedASCII(withNullTermination: false)
            }
        }
        return identifier
    }

Except, for these unknown frames, the ID3 identifier isn't being reconstructed. So they're getting designated as UserDefinedText frames, and then the encoder tries to reassemble the frame from that assumption, and it's all a mess.

So the question is...why is this function not getting the identifier from the layout?

Our layout is .unknown(identifier) right? We call FrameLayoutIdentifer.id3Identifier to get the identifier, which is:

    func id3Identifier(version: Version) -> String? {
        switch self {
            case .known(let known):
                return known.id3Identifier(version: version)
            case .unknown(let identifier):
                return identifier
        }
    }

So, we SHOULD be getting that identifier back at this point, but we're not.

I'm still digging, but yeah, this is weird.

NCrusher74 commented 4 years ago

And the answer to why the identifier isn't being reconstructed here is because it's been lost somehow:

    private func identifierData(
        layout: FrameLayoutIdentifier,
        version: Version) -> Data {

        print(layout.id3Identifier(version: version)) // nothing prints
        guard let identifier = layout.id3Identifier(version: version)?.encodedASCII(withNullTermination: false) else {
            switch version {
                case .v2_2: return "TXX".encodedASCII(withNullTermination: false)
                case .v2_3, .v2_4: return "TXXX".encodedASCII(withNullTermination: false)
            }
        }
        return identifier
    }
NCrusher74 commented 4 years ago

Sorry, I'm spamming. Using commenting to you as my form of rubber duck debugging again.

So, we start out here in Tag:

            // extract the identifier data
            let identifierBytes = remainder.extractFirst(version.identifierLength)
            // if the first byte is 0x00, it's padding, not a frame
            guard identifierBytes.first != 0x00 else {
                break
            }
            // convert identifier data to string
            let identifier = try String(ascii: identifierBytes)
            // pass the frames data over to `Frame` to determine which frame type will handle it
            let frame = try Frame(identifier: identifier,
                                  data: &remainder,
                                  version: version)

We're peeling off the number of identifier bytes appropriate for the version, converting them to a string, and passing it all over to Frame to handle from there.

    init(identifier: String,
         data: inout Data.SubSequence,
         version: Version) throws {
        // use the identifier string to determine frame layout
        let layout = FrameLayoutIdentifier(identifier: identifier)
        switch layout {
//
            case .unknown(identifier):
                self = .unknownFrame(try UnknownFrame(
                    decodingFromStartOf: &data,
                    version: version,
                    layout: layout))

So, first, we're taking that identifier string and using it to initialize a layout:

    init(identifier: String) {
        if let known = KnownFrameLayoutIdentifier(identifier: identifier) {
            self = .known(known)
        } else {
            self = .unknown(identifier)
        }
    }

So we initialize an unknown layout and store the identifier string along with the layout data.

And then, back in Frame we send it on to the frame handler appropriate for that layout.

    init(decodingContents contents: Data.SubSequence,
         version: Version,
         layout: FrameLayoutIdentifier,
         flags: Data
    ) throws {
        self.flags = flags
        self.layout = layout
        self.frameKey = .unknown(uuid: self.uuid)
        self.contents = contents
    }

At this point, we still have the layout which means we still should have the identifier. Except we don't. A print of layout.id3Identifier(version:) yields nothing. So there is no identifier by the time we get here.

So backing up a step...

A command to print(identifier) in the FrameLayoutIdentifier.unknown initializer also yields nothing. So, the layout isn't even being initialized.

Backing up another step...

A command to print(layout.id3Identifier(version:) in the Frame initializer...results in all the frames being designated "TXXX". Whaaa?

Okay, so I was wrong. This is a problem in the decoding phase.

But backing up another step puts us back in the Tag initializer. But when I print the identifier string there immediately before initializing the Frame, I get:

TXXX PCNT OWNE MCDI RVRB POPM USER WCOM

So literally, between print(identifier) and let frame = the identifier is getting mangled. How?

            let identifier = try String(ascii: identifierBytes)
            // pass the frames data over to `Frame` to determine which frame type will handle it
            print(identifier)
            let frame = try Frame(identifier: identifier,
                                  data: &remainder,
                                  version: version)

The FrameLayoutIdentifier initializer receives the correct data:

    init(identifier: String) {
        print(identifier) // prints correctly
        if let known = KnownFrameLayoutIdentifier(identifier: identifier) {
            self = .known(known)
        } else {
            self = .unknown(identifier)
            print(self) // nothing prints
        }
    }

    func id3Identifier(version: Version) -> String? {
        switch self {
            case .known(let known):
                return known.id3Identifier(version: version)
            case .unknown(let identifier):
                print(identifier) // nothing prints
                return identifier
        }
    }

So, this is where it's happening. It looks like case .unknown isn't being initialized properly.

I think the problem may be in the KnownFrameLayoutIdentifier initializer, which is:

    init?(identifier: String) {
        self = KnownFrameLayoutIdentifier.stringToLayoutMapping[identifier] ?? .userDefinedText
    }

That's where the userDefinedText designation is coming from.

I can't find a way to get rid of that fallback. But I don't have anything in place for an unknown frame in KnownFrameLayoutIdentifier because I never expected it to receive any data for an unknown frame. Will work on that.

Yep, that was the problem:

    init?(identifier: String) {
        if let layout = KnownFrameLayoutIdentifier.stringToLayoutMapping[identifier] {
            self = layout
        } else {
            return nil
        }
    }

Easy fix after all. Now to see if it works.

NCrusher74 commented 4 years ago

Easy fix after all.

... or not.

// FrameLayoutIdentifier
    init(identifier: String) {
        if let known = KnownFrameLayoutIdentifier(identifier: identifier) {
            self = .known(known)
        } else {
            self = .unknown(identifier)
            print(self) // prints correctly
        }
    }

    func id3Identifier(version: Version) -> String? {
        switch self {
            case .known(let known):
                return known.id3Identifier(version: version)
            case .unknown(let identifier):
                print(self) // nothing prints
                print(identifier) // nothing prints
                return identifier
        }
    }

Ugh. How can the initialized layout (.unknown) be lost between the initializer and this function?

SDGGiesbrecht commented 4 years ago

I’m taking a look now.

SDGGiesbrecht commented 4 years ago

I’ve checked out this branch. What test should I run to trigger it?

NCrusher74 commented 4 years ago

Just a second, let me push the changes I've made.

NCrusher74 commented 4 years ago

Okay, I've pushed what I was working on, which is current up through my last comment.

The test I've been running just to get the print statements from various places is the final test in SwiftTaggerID3_Write_Tests called testUnknownFramePassThrough()

I think the problem may be in that, if I'm not mistaken, nothing is calling the frame-building initializer or giving it an identifier string for the UnknownFrame.

SDGGiesbrecht commented 4 years ago

I’m afraid I’m rather lost. When I run that test, it passes. When I uncomment print(tag) this is what I see (cleaned up a little):

Optional(
  [
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("OWNE"),
        frameKey: .unknown(uuid: D92C8E33-4953-4BEA-94D9-BF0490871B9E),
        uuid: D92C8E33-4953-4BEA-94D9-BF0490871B9E,
        contents: 14 bytes
      )
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("PCNT"),
        frameKey: .unknown(uuid: 2A49C337-B9E5-4D71-A28E-71207B205A15),
        uuid: 2A49C337-B9E5-4D71-A28E-71207B205A15,
        contents: 4 bytes
      )
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("POPM"),
        frameKey: .unknown(uuid: 2BD13232-CBCE-4305-9B21-CB2A7069258E),
        uuid: 2BD13232-CBCE-4305-9B21-CB2A7069258E,
        contents: 6 bytes
      )
    ),
    .localizedFrame(
      frameKey: .userDefinedText(description: "YiTG")
      Optional("YiTG"): 1
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("RVRB"),
        frameKey: .unknown(uuid: 306BF1F6-6DBA-420A-A0EA-FFC0AD4AC303),
        uuid: 306BF1F6-6DBA-420A-A0EA-FFC0AD4AC303,
        contents: 12 bytes
      )
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("USER"),
        frameKey: .unknown(uuid: D6051AF6-CA34-429C-B8B8-3D58760594E8),
        uuid: D6051AF6-CA34-429C-B8B8-3D58760594E8,
        contents: 7 bytes
      )
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("WCOM"),
        frameKey: .unknown(uuid: 74FB3640-153B-4C1E-B588-B5C03D793090),
        uuid: 74FB3640-153B-4C1E-B588-B5C03D793090,
        contents: 21 bytes
      )
    ),
    .unknownFrame(
      UnknownFrame(
        flags: 2 bytes,
        layout: .unknown("MCDI"),
        frameKey: .unknown(uuid: D5130A23-DEA2-4E28-AD03-94C72F2C7271),
        uuid: D5130A23-DEA2-4E28-AD03-94C72F2C7271,
        contents: 3 bytes
      )
    )
  ]
)

What part of that is not what it is supposed to be?

In fact, I can uncomment everything in that test and it still passes.

(I did adjust this the one line that does throw Mp3File.Error.CannotReadFile so that it re‐attempts once before actually throwing, since the system was occasionally denying access because it hadn’t yet finished releasing the file from a previous access attempt. Was that happening to you?)

NCrusher74 commented 4 years ago

Well, I feel a little sheepish. I actually hadn't tried the print(tag) in the test since I'd discovered the point in FrameProtocol where it appeared the identifier was being lost. I was trying to get a print at that point to display the identifier and hadn't gone back to the test to see what I was getting after everything was said and done.

I'm not encountering that retry issue with SwiftTaggerID3 (at least I haven't yet) but I do believe I'm encountering it when working on SwiftTaggerMP4 because there are times when I will run a test and it won't work, and then I'll run it again without changing anything and it will.

Is there a particular command to get it to retry before throwing the error?

SDGGiesbrecht commented 4 years ago

No. I was lazy because I was just testing and nested it once at the throw site, so that it would have to fail twice in a row to actually abort:

do {
  self.data = try Data(contentsOf: location)
} catch {
  do {
    self.data = try Data(contentsOf: location)
  } catch {
    throw Mp3File.Error.CannotReadFile
  }
}
NCrusher74 commented 4 years ago

Okay. I'm afraid to say this, but I should probably build some more rigorous testing for the PresetOptionsFrame before I close the book on all of this.

Yikes. I'm scared.

NCrusher74 commented 4 years ago

Okay, I think I've done about as much as I can do as far as unit tests are concerned.

There were some comments you made in Workspace.swift that I wanted to ask about:

// To my knowledge, SwiftTaggerID3 only supports macOS so far. This removes the others.

I definitely wouldn't mind it supporting other platforms, but I'm not sure what I need to do to make that happen.

// #workaround(Not sure which of these style guidelines the project wants to follow. These ones are disabled because they currently flag violations.) // These are rules provided by Workspace natively.

I'm not sure what these style guidelines do or if I need them?

I noticed that all the package stuff makes the project quite weighty (it went from something like 12MB to over 1GB), and I don't know how dependencies work later on down the line. Does all that code get added to my project, and then to anything that uses my project as a dependency?

SDGGiesbrecht commented 4 years ago

I definitely wouldn't mind it supporting other platforms, but I'm not sure what I need to do to make that happen.

You would just need to start testing on those platforms too, and adjust based on anything that fails. All the Apple platforms can be tested from macOS; the others would be more complicated.

I'm not sure what these style guidelines do or if I need them?

Style guidelines are never needed. But just like it’s nicer to read a novel or newspaper than the average text message, code is nicer to read and work on if it is arranged neatly and consistently. It’s mostly a matter of preference. People who turn off their spell checker turn into coders who avoid style‐consistency checks. People who use capitals and punctuation even in text messages turn into coders who enforce style checks before a feature is allowed to be merged.

I noticed that all the package stuff makes the project quite weighty (it went from something like 12MB to over 1GB), and I don't know how dependencies work later on down the line. Does all that code get added to my project, and then to anything that uses my project as a dependency?

A client won’t inherit any of it (unless you were to make your .product depend on it).

I only put it in the manifest so that you could use autocomplete while you were adjusting the configuration, but you don’t need that. If you remove it from the manifest, Xcode will no longer fetch any of it.

You can also reduce the space it takes up in your development directory by actually installing the tool using the instructions on its read‐me. If has been installed on the system, that will simply be used directly instead of cloning anything in your project’s .build directory. It only does that to make things more convenient for one‐time contributors who don’t want anything permanently installed. They can simply double‐click the script to get what they need, but since it’s all in the repository, it vanishes completely when they’re done. If you do install the tool, delete the .build directory to clean it out and free up space.

SDGGiesbrecht commented 4 years ago

I looked it up and my system install currently takes up 550 MB, but includes a lot of accumulated junk from years of updates. I believe it could be manually culled down to 40 MB safely if space is an issue for you.

NCrusher74 commented 4 years ago

Okay. Testing on other platforms is never something I've had to work with yet. I will see if I can figure that out.

Style guidelines are never needed. But just like it’s nicer to read a novel or newspaper than the average text message, code is nicer to read and work on if it is arranged neatly and consistently.

In another lifetime, I worked in publishing. I'm all for rigorous proofreading and tidy editing. I just need to figure out which, if any, of those style guidelines should apply here.

NCrusher74 commented 4 years ago

Space isn't an issue for me, I just wanted to make sure SwiftTagger remained as slim and streamlined as possible. ID3TagEditor was way bigger than it needed to be because of all the unnecessary stuff in there. Mp42Foundation is, as its own author admits, "overbuilt" and contains a whole bunch of stuff I don't need. My hope with SwiftTagger (ID3 and MP4) is to strike a balance between making tagging functionality convenient, while providing a tool that allows you to use all the same commands for ID3 and MP4.

Then I can fold all that into AudiobookTagger (the way I was going to do with ID3TagEditor and Mp42Foundation) so that I have an audiobook-centric tagging tool to use in AudiobookLiberator and onward. (nope, haven't forgotten those; all of this skipping around between projects still has a purpose.)

In some cases, achieving what I want will involve creating non-standard tags in one format because they exist in the other (you can see a lot of these in SwiftTaggerID3s LocalizedFrame get/set properties, because I created a bunch of Comment and UserDefinedText frames that will mimic MP4 tagging atoms.) I'll be doing the same on the SwiftTaggerMp4 side, creating tags modeled after ID3 frames that don't have an MP4 equivalent.

So, ultimately, by the time I go back to AudiobookTagger, what I'll have is three packages: SwiftTagger, which will have SwiftTaggerID3 and SwiftTaggerMP4 as dependencies and provide any bridging functionality necessary between them, and then just SwiftTaggerID3 and SwiftTaggerMP4 which can be used independently if cross-format functionality isn't needed. I'll be making all of those available for anyone who wants to fork or submit pull requests to increase the functionality.

SwiftTaggerMP4 will be a lot simpler. Virtually all the functionality I need is already part of AVFoundation, it's just not very intuitive to use and it's not easily compatible with SwiftTaggerID3.

NCrusher74 commented 4 years ago

Hm. Trying to use the style guidelines resulted in a recommendation to replace a bunch of hyphens and apostrophes with versions that broke things so that no changes would save anymore, so I think I'm better off not messing with that.

Which means...I think I'm pretty much done here, at least until I actually put using this into practice so I can put it through its paces. I'm going to merge this branch and catch up on SwiftTaggerMP4