Multi stage id3 writing

NCrusher74 commented 4 years ago

I've got to run out the door to work, but I got the AudiobookTag portion of this done. I think. I'm only guessing and I haven't had a chance to test it, but all the cases return an OutcastID3TagFrame at any rate?

I had to include an enum for whether it's reading or writing, however, because the parameters for the OutcastID3TagFrame differ, depending on if the frame is being parsed or written to.

One variable, dataString, is currently unused, and that's probably going to be a sticking point. If the data is being "passed through", then what's in there needs to be written to the output; but if the content of the frame is being edited or added, it doesn't, and the frameContent parameter gets used instead.

It makes no accounting for the fact that some of these frames are going to need to be handled as Int or Array or Date yet, either.

It's messy. But I'm hoping I'm on the right track with it.

NCrusher74 commented 4 years ago

I tried removing the cases I had added to the StringFrame enum of OutcastID3 to see if those were causing the problem, but even just the most basic test of OutcastID3 functionality isn't working.

I'm getting "XCTAssertEqual failed: ("Artist

NCrusher74 commented 4 years ago

Oh this is weird. My comment keeps getting cut off. Long story short, OutcastID3 isn't working. Can you get it to work?

NCrusher74 commented 4 years ago

Okay, so, after restarting my computer and cleaning my build folder for the umpteenth time, it's writing again, and sort of working as intended. Except for a couple strange quirks.

Part of what I'm writing is these two frames:

            OutcastID3.Frame.CommentFrame(
                encoding: .utf8, language: "eng",
                commentDescription: "description",
                comment: "Comment Test"),
            OutcastID3.Frame.TranscriptionFrame(
                encoding: .utf8, language: "eng",
                lyricsDescription: "description",
                lyrics: "Lyrics Test")

and then I'm running these two tests:

            // ...other tests on the StringFrame type
            } else if let isolatedFrame = frame as? OutcastID3.Frame.CommentFrame {
                XCTAssertEqual(isolatedFrame.language, "eng")
                XCTAssertEqual(isolatedFrame.commentDescription, "description")
                XCTAssertEqual(isolatedFrame.comment, "Comment Test")
            } else if let isolatedFrame = frame as? OutcastID3.Frame.TranscriptionFrame {
                XCTAssertEqual(isolatedFrame.language, "eng")
                XCTAssertEqual(isolatedFrame.lyricsDescription, "description")
                XCTAssertEqual(isolatedFrame.lyrics, "Lyrics Test")
            }

Which is getting me this result:


/Users/nolainecrusher/Documents/Git/AudiobookTagger/AudiobookTaggerTests/OutcastID3Tests.swift:124: error: -[AudiobookTaggerTests.OutcastID3Tests testOutcastFunctionality] : XCTAssertEqual failed: ("

NCrusher74 commented 4 years ago

Okay seriously, why are my comments getting cut off? And when I try to repost them they get cut off as well.

I upgraded my Mac last night? Maybe Firefox doesn't like the upgrade?

At any rate, trying again. Here is the test failure I see:


XCTAssertEqual failed: ("

NCrusher74 commented 4 years ago

Okay, I think maybe the reason my comments are getting cut off may be the same reason my tests are failing. They seem to be getting cut off where the extra character is lurking when I cut and paste:

(and here's the screenshot that didn't make it into that post)

I've gone through and tried every kind of encoding I recognize instead of .utf8 and none of them improve the result.

Since this seems to be an OutcastID3 issue, I've opened an issue. But if they don't end up getting back to me, I'm sort of well and truly stuck.

SDGGiesbrecht commented 4 years ago

It looks like there are invisible control codes lurking in the text. If the 0000 and 0003 are hex for the Unicode scalars, those are ASCII controls for “null” and “end of text”. I would guess that the string is being read starting two bytes early. Does either byte mean anything in the binary file specification? “Null” is often used to mark the end of a string instead of storing its length alongside it. And “end of text” looks like it would be used as a data separator. On the other hand, it could just as easily be a 32‐bit, little‐endian integer with the value of “3”.

If you don’t have access to the low‐level data handling to fix the read position, you might be able to work around it by lopping off all leading characters that are less than space (hex scalar value U+0020).
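
A minimal sketch of that workaround, assuming the stray scalars only ever sit at the front of the string. The helper name droppingLeadingControls is hypothetical and not part of OutcastID3:

```swift
/// Hypothetical helper (not part of OutcastID3): drops every leading
/// Unicode scalar below U+0020 (space) and keeps the rest untouched.
func droppingLeadingControls(_ text: String) -> String {
    var kept = String.UnicodeScalarView()
    // drop(while:) stops at the first scalar that is U+0020 or above.
    kept.append(contentsOf: text.unicodeScalars.drop(while: { $0.value < 0x20 }))
    return String(kept)
}
```

Note that a leading tab or newline is also below U+0020, so this would strip those as well.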

NCrusher74 commented 4 years ago

I'm not able to load id3.org from here right now (It's been giving me a bad gateway error the last couple days) but I know what that is supposed to be, I think.

The thing that has stumped me as far as trying to recreate those two particular tags in ID3TagEditor wasn't the language code, it was actually the terminator for the first of the two strings.

There are three elements to that tag. The language code, in the form of the 3-letter ISO-639-2 code, a description string that is supposed to be terminated somehow, and then the actual content.

This is the description as pertains to version 2.2, but 2.3 and 2.4 are pretty much the same:

Comments

This frame replaces the old 30-character comment field in ID3v1. It consists of a frame head followed by encoding, language and content descriptors and is ended with the actual comment as a text string. Newline characters are allowed in the comment text string. There may be more than one comment frame in each tag, but only one with the same language and content descriptor.

Comment                     "COM"
Frame size                  $xx xx xx
Text encoding               $xx
Language                    $xx xx xx
Short content description   $00 (00)
The actual text

So I suspect that short comment description terminator is somehow ending up at the beginning of the actual text?
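
To make the layout concrete, here is a rough sketch of reading such a frame body (the bytes after the frame header). It is an illustration under assumptions, not OutcastID3's or ID3TagEditor's code: the type and function names are made up, and only the encodings with a single-byte terminator are handled (0x00 for ISO-8859-1, and 0x03 for UTF-8 as used in ID3v2.4).

```swift
import Foundation

/// Hypothetical container for the three parts of a comment frame body.
struct SketchCommentBody: Equatable {
    let language: String
    let contentDescription: String
    let text: String
}

/// Sketch only: parses encoding byte, language, terminated description, text.
func parseCommentBody(_ body: [UInt8]) -> SketchCommentBody? {
    // Need at least the encoding byte plus the 3-byte language code.
    guard body.count >= 4 else { return nil }

    let encoding: String.Encoding
    switch body[0] {
    case 0x00: encoding = .isoLatin1
    case 0x03: encoding = .utf8
    default: return nil   // UTF-16 uses a two-byte terminator; not covered here
    }

    guard let language = String(bytes: body[1..<4], encoding: .isoLatin1) else { return nil }

    // The description runs up to the first 0x00; the text starts right AFTER it.
    let rest = body[4...]
    guard let terminator = rest.firstIndex(of: 0x00),
          let description = String(bytes: rest[..<terminator], encoding: encoding),
          let text = String(bytes: rest[(terminator + 1)...], encoding: encoding)
    else { return nil }

    return SketchCommentBody(language: language, contentDescription: description, text: text)
}
```

If the text were read starting at the terminator instead of one byte past it, the 0x00 would end up as the first character of the comment, which matches the symptom described here.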

SDGGiesbrecht commented 4 years ago

Something like that, yes.

NCrusher74 commented 4 years ago

That just leaves me wondering what to do about it.

SDGGiesbrecht commented 4 years ago

Where is the implementation? It’s in another repository, right?

NCrusher74 commented 4 years ago

You mean my implementation? Everything I'm working on right now is here in this branch.

SDGGiesbrecht commented 4 years ago

For some reason I mistakenly thought CommentFrame was something you had added, so that is what I was asking about. But after hunting for it I realized it is from upstream.

Then unless you want to try to fix it for them, my workaround suggestion above is probably the best way forward.

NCrusher74 commented 4 years ago

Sorry, no. It was ID3TagEditor that I was trying to add it to.

But it is looking like, unless I can find a way to get ID3TagEditor and OutcastID3 to work together, I'm going to have to migrate everything to OutcastID3 because it has all the frames I need and ID3TagEditor doesn't.

Actually, I think I'm going to have to do that anyway, because OutcastID3 doesn't like the encoding of frames written by ID3TagEditor. At least, I suspect that is what was happening when it was telling me that "Artist" was not equal to "Artist."

Unfortunately, migrating means I'm going to break all the stuff you did with the date formatting.

SDGGiesbrecht commented 4 years ago

I suspect that is what was happening when it was telling me that "Artist" was not equal to "Artist."

It depends whether the error was during the write (gibberish was written to the file) or the read (the string includes some extra data on the front end). In the second case, you can just ignore the leading junk each time you read.

Try printing string.unicodeScalars.map({ $0.value }) to get the Unicode scalar values of the two “Artist” tags. That will let you see the invisible characters.

NCrusher74 commented 4 years ago

Okay, give me a few to try to set that back up. I had deleted that test.

NCrusher74 commented 4 years ago

This test:

        for frame in outcastFrames {
            if let isolatedFrame = frame as? OutcastID3.Frame.StringFrame {
                if isolatedFrame.type == AudiobookTag.authors.outcastType {
                    XCTAssertEqual(isolatedFrame.str, "Artist")
                    print((isolatedFrame.str).unicodeScalars.map({ $0.value }))
                    print("Artist".unicodeScalars.map({ $0.value }))
                }
            }
        }
        let id3TagEditor = ID3TagEditor()
        if let id3Tag = try id3TagEditor.read(from: Bundle.testMp3FullMeta.path) {
            print((id3Tag.frames[.Artist] as? ID3FrameWithStringContent)?.content.unicodeScalars.map({ $0.value }))
        }

Gets me these results:

/Users/nolainecrusher/Documents/Git/AudiobookTagger/AudiobookTaggerTests/OutcastID3Tests.swift:26: error: -[AudiobookTaggerTests.OutcastID3Tests testMP3AudiobookTag] : XCTAssertEqual failed: ("Artist[X]") is not equal to ("Artist")
[65, 114, 116, 105, 115, 116, 0]
[65, 114, 116, 105, 115, 116]
Optional([65, 114, 116, 105, 115, 116])

note* X marks the extra character I removed so that the post wouldn't get cut off.

So it looks like Outcast is adding the character upon reading, not that it's being written there?

SDGGiesbrecht commented 4 years ago

Yes, it looks like it is erroneously including the trailing “null”/0 that is there to mark the end of the string. If you remove any trailing 0 bytes from isolatedFrame.str after you read it but before you actually do anything with it, everything should be fine.
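
One possible way to do that, as a sketch (the helper name trimmingTrailingNulls is hypothetical):

```swift
/// Hypothetical helper: removes any U+0000 scalars from the END of a string,
/// leaving everything else as it was.
func trimmingTrailingNulls(_ text: String) -> String {
    var scalars = text.unicodeScalars
    while let last = scalars.last, last.value == 0 {
        scalars.removeLast()
    }
    return String(scalars)
}
```

With this, trimmingTrailingNulls(isolatedFrame.str) could be compared against "Artist" instead of the raw string.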

NCrusher74 commented 4 years ago

I will research how to do that. This is my first time ever working with text encoding, so I know pretty much nothing.

NCrusher74 commented 4 years ago

(Also, I think Outcast just may be a hot mess. It doesn't like writing to my "No Meta" MP3 file, either. But it will write to the one that has metadata on it.)

NCrusher74 commented 4 years ago

Edit: Never mind. I found the answer. For this one at least.

NCrusher74 commented 4 years ago

Okay, so I managed to remove the null characters using this:

                    let trimmedString = (isolatedFrame.str).replacingOccurrences(of: "\0", with: "")
                    XCTAssertEqual(trimmedString, "Artist")

And it worked for the null character at the beginning of the TranscriptionFrame and CommentFrame content strings as well. But none of these special characters:

\0 (null character)
\\ (backslash)
\t (horizontal tab)
\n (line feed)
\r (carriage return)
\" (double quote)
\' (single quote)

is the other (end of text) character. So I still need to get rid of that one. Or figure out how to implement your suggestion about the characters smaller than space?

NCrusher74 commented 4 years ago

I'm going through the OutcastID3 code to see if I can fix this issue without jumping through so many hoops.

The good news is, I found the bug that was causing the Transcription/Lyrics frame content to write to the Comment frame. Looks like they just c/p the code from the Transcription frame and missed one point where they were supposed to replace Transcription with Comment. But that's another issue.

Here is what they have:

extension OutcastID3.Frame.CommentFrame {
    public static func parse(version: OutcastID3.TagVersion, data: Data, useSynchSafeFrameSize: Bool) -> OutcastID3TagFrame? {

        var frameContentRangeStart = version.frameHeaderSizeInBytes

        let encoding = String.Encoding.fromEncodingByte(byte: data[frameContentRangeStart], version: version)
        frameContentRangeStart += 1

        let languageLength = 3
        let languageBytes = data.subdata(in: frameContentRangeStart ..< frameContentRangeStart + languageLength)

        guard let language = String(bytes: languageBytes, encoding: .isoLatin1) else {
            print("Unable to read language")
            return nil
        }

        frameContentRangeStart += languageLength

        let commentDescription = data.readString(offset: &frameContentRangeStart, encoding: encoding, terminator: version.stringTerminator(encoding: encoding))

        let comment: String?

        if frameContentRangeStart < data.count {
            let commentData = data.subdata(in: frameContentRangeStart ..< data.count)
            comment = String(data: commentData, encoding: encoding)
        }
        else {
            comment = nil
        }

In the process of looking up what the encodingByte was I found this:

    var encodingByte: UInt8 {
        switch self {
        case .utf8: return 0x3
        case .utf16: return 0x1
        default: return 0x0
        }
    }

Could this be the source of the other byte I can't find a way to get rid of? The one we thought was end of text?

NCrusher74 commented 4 years ago

So, this is what I managed to cobble together to make the test pass:

            } else if let isolatedFrame = frame as? OutcastID3.Frame.TranscriptionFrame {
                XCTAssertEqual(isolatedFrame.language, "eng")
                XCTAssertEqual(isolatedFrame.lyricsDescription, "description")
                var stringToTrim = isolatedFrame.lyrics.unicodeScalars.map({ $0.value })
                let bytesToRemove: [UInt32] = [0,3]

                for byte in bytesToRemove {
                    for (index, value) in stringToTrim.enumerated() {
                        if value == byte && stringToTrim.contains(byte) {
                            stringToTrim.remove(at: index)
                        }
                    }
                }
                let trimmedString = String(codeUnits: stringToTrim, codec : UTF32())!
                XCTAssertEqual(trimmedString, "Lyrics Test")

(that last bit is an extension to String I found here for converting UInt32 back to String)

I feel like maybe this is the long way around and there could be a simpler way? (Especially since I apparently have to do this with every single item it reads in.) But I couldn't find it, and this does work. Assuming no other characters jump out to surprise me along the way.

SDGGiesbrecht commented 4 years ago

Or figure out how to implement your suggestion about the characters smaller than space?

You can enter any character—defined or not—as an escape sequence with \u{41}, where the digits in the braces are the hexadecimal code. The 41 in the example is the capital A.

You also have access to the Unicode scalars of the string with this: "some string".unicodeScalars. That is a Collection of Unicode.Scalar instances with most of the same methods as an Array:

var string = "some string"
string.unicodeScalars.remove(where: { $0.value < 0x20 })

(Here 0x... is writing an integer in hexadecimal instead of decimal.)

Note: The solution you found does work properly, it just does more work than necessary.

Could this be the source of the other byte I can't find a way to get rid of? The one we thought was end of text?

Yes. It is most likely including a UTF‐8 flag inside the string by accident. It means the only garbage bytes you’re likely to see are all 0x03 or less (which is a good thing; you won’t get data that just happens to pass for a letter).
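
For reference, the text-encoding byte in ID3v2.4 takes one of four values. A small sketch of the reverse mapping (byte to Foundation encoding) follows; the function name is hypothetical, and in v2.2/v2.3 only 0x00 and 0x01 are defined:

```swift
import Foundation

/// Hypothetical helper: maps an ID3v2.4 text-encoding byte to a String.Encoding.
func stringEncoding(forID3Byte byte: UInt8) -> String.Encoding? {
    switch byte {
    case 0x00: return .isoLatin1        // ISO-8859-1
    case 0x01: return .utf16            // UTF-16 with byte order mark
    case 0x02: return .utf16BigEndian   // UTF-16BE without BOM (v2.4 only)
    case 0x03: return .utf8             // UTF-8 (v2.4 only)
    default:   return nil               // not a defined encoding byte
    }
}
```

Because every defined value is 0x03 or less, an encoding byte that leaks into the decoded text shows up as a control character rather than as a visible letter.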

NCrusher74 commented 4 years ago

I tried about a dozen variations on that last night and couldn't get the syntax right.

...and apparently I still can't.

                    var stringToTrim = isolatedFrame.str
                    let trimmedString = stringToTrim.unicodeScalars.remove(where: { $0.value < 0x20 })

gets me Incorrect argument label in call (have 'where:', expected 'at:') where the recommended "fix" causes another error about the bool being ambiguous.

I don't think where: comparisons like me very much. I'm not kidding when I say I tried that over and over last night in different ways with no luck. I always end up with some error.

NCrusher74 commented 4 years ago

Okay, finally figured it out.

                if isolatedFrame.type == AudiobookTag.authors.outcastType {
                    var stringToTrim = isolatedFrame.str
                    if let index = stringToTrim.unicodeScalars.firstIndex(where: { $0.value < 0x20 }) {
                        // remove(at:) returns the scalar it removed, so test the mutated string itself
                        stringToTrim.unicodeScalars.remove(at: index)
                        XCTAssertEqual(stringToTrim, "Author Write Test")
                    }
                }

Which is a little more compact than what I was doing.

Now I just need to figure out how to "pass-thru" the values, so that I can use OutcastID3 to read what ID3TagEditor outputs, add its own frames to the tag, and output the final result.

SDGGiesbrecht commented 4 years ago

Incorrect argument label in call (have 'where:', expected 'at:')

Sorry, that was my fault. I typed it wrong. The method name is removeAll, not remove:

var string = "some string"
string.unicodeScalars.removeAll(where: { $0.value < 0x20 })
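
Wrapped up as a reusable helper, that might look like the sketch below. The name strippingControlScalars and the keeping: parameter are made up for illustration. Tab, line feed and carriage return are all below 0x20, so this variant leaves them in place by default, on the assumption that multi-line lyrics or comments should keep their line breaks:

```swift
/// Hypothetical helper: removes control scalars below U+0020, except the
/// ones listed in `allowed` (tab and line breaks by default).
func strippingControlScalars(
    _ text: String,
    keeping allowed: Set<Unicode.Scalar> = ["\n", "\r", "\t"]
) -> String {
    var result = text
    result.unicodeScalars.removeAll(where: { $0.value < 0x20 && !allowed.contains($0) })
    return result
}
```

Passing an empty set for keeping: gives the same behaviour as the plain removeAll(where:) call above.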

NCrusher74 commented 4 years ago

Ah ok, got it, thank you.

So. All my tests are passing, but they really shouldn't be, especially the last group of them, because the final output file is coming out with no metadata written to it at all.

I suspect I'm doing it wrong to use:

        var fixedFrames: [OutcastID3TagFrame]?
        // ... stuff
                    fixedFrames?.append(OutcastID3.Frame.StringFrame(
                        type: AudiobookTag.authors.outcastType!,
                        encoding: .utf8,
                        str: fixString(string: isolatedFrame.str)))
     // ... more stuff
            // write to final output file with OutcastID3
            let outcastTag = OutcastID3.ID3Tag(
                version: .v2_4,
                frames: fixedFrames ?? []
            )
            let outputUrl = URL(fileURLWithPath: NSHomeDirectory() + "/outcast-PassThru-testfile.mp3")
            try outcastFile.writeID3Tag(tag: outcastTag, outputUrl: outputUrl)

but that doesn't explain why the last bunch of tests are passing when the file they're reading from has nothing written to it.

SDGGiesbrecht commented 4 years ago

You mean these tests are passing?

for frame in outcast2Frames {
  if let isolatedFrame = frame as? /* ... */ {
    if isolatedFrame.type == /* ... */ {
      XCTAssertEqual(/* ... */)
    }
  }
  // ...
}

If the file is empty, the loop runs 0 times and none of the tests inside it take place. With no failures, the test passes.

To catch it, add this:

XCTAssertFalse(outcast2Frames.isEmpty, "No frames.")
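
The effect is easy to reproduce outside XCTest. In this sketch (countChecks is a hypothetical stand-in for the test loop) the counter shows how many comparisons actually ran:

```swift
/// Hypothetical stand-in for the test loop: returns how many checks ran.
func countChecks(in frames: [String]) -> Int {
    var checksRun = 0
    for frame in frames where frame == "Artist" {
        // An XCTAssertEqual would sit here; it only executes per matching frame.
        checksRun += 1
    }
    return checksRun
}
```

With an empty array the body never executes, so there is nothing that could fail.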

As to why nothing is being written at all, it’s this line:

var fixedFrames: [OutcastID3TagFrame]?

It creates a variable that can hold an array or nothing. By default it holds nothing. Then your attempts to append do this:

fixedFrames?.append(/* ... */)

The question mark means “if there is an array in fixedFrames, add to it”. Since there is no array, nothing is ever added.

Pull the question mark off the variable and start with an empty array instead:

var fixedFrames: [OutcastID3TagFrame] = []
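
A stripped-down comparison of the two declarations, using plain strings in place of frames (both function names are hypothetical):

```swift
/// Optional array that is never initialised: the append is silently skipped.
func appendThroughOptional() -> [String]? {
    var frames: [String]?        // starts out as nil
    frames?.append("Author")     // no array to append to, so nothing happens
    return frames
}

/// Non-optional array that starts empty: the append takes effect.
func appendToEmptyArray() -> [String] {
    var frames: [String] = []
    frames.append("Author")
    return frames
}
```

The first function returns nil, which is why `fixedFrames ?? []` ended up passing an empty list to the tag.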

NCrusher74 commented 4 years ago

Ahh ok thank you.

NCrusher74 commented 4 years ago

So, just as commentary, in an effort to check how these tags I'm writing are going to work out when read by other apps, I opened them in a variety of tagger apps, an audio editor app called Fission, and iTunes/Music, and left notes to myself in comments in my test file because that was what was handy.

And now I'm going to scream because these comment and lyrics tags I've been fighting with for days? Aren't recognized by much of anything. Meanwhile, whatever Fission writes to comments and lyrics IS recognized, and I have no idea what tag that is.

    // MediaInfo reads what should be lyricsContent from lyricsDescription instead
    // MediaInfo does not appear to read the Comment tag. It reads the Subtitle ("description" in OutcastID3) as "Track_More"

    // Itunes/Music does not appear to read anything as a Comment from the tags provided
    // MediaInfo doesn't read whatever Music DOES write to in Comments, nor does Kid3

    // Kid3 doesn't appear to read anything from either Comments or Lyrics. It will read Subtitle and PodcastDescription

    // Yate doesn't appear to read Comments as written by Outcast, nor as written by iTunes. It does read Lyrics appropriately. It reads "Subtitle" as "Description".

    // Fission doesn't read the comment or lyrics from Outcast either. What it writes to those fields is recognized by Yate, MediaInfo, Kid3 AND Itunes.
    // that is just... really really annoying.

NCrusher74 commented 4 years ago

Well. It appears the problem may be that OutcastID3 doesn't write to Comment and Lyrics properly.

I think it may be time for me to cut bait where OutcastID3 is concerned and just resign myself to the fact that I NEED to create these frames for ID3TagEditor. Honestly, if I could figure out how the parsing and adapters work, I'd probably be fine. For example:

This is the contents of ID3SubtitleFrameCreator, which is a text-based frame type I created by cutting and pasting all the relevant code from the other purely text-based frame types:

class ID3SubtitleFrameCreator: ID3StringFrameCreator {
    override func createFrames(id3Tag: ID3Tag, tag: [UInt8]) -> [UInt8] {
        if let subtitleFrame = id3Tag.frames[.Subtitle] as? ID3FrameWithStringContent {
            return createFrameUsing(frameType: .Subtitle, content: subtitleFrame.content, id3Tag: id3Tag, andAddItTo: tag)
        }
        return super.createFrames(id3Tag: id3Tag, tag: tag)
    }
}

Here is the ID3FrameWithStringContent class:

public class ID3FrameWithStringContent: ID3Frame {
    /// The content as string.
    public let content: String

    /**
     Init an ID3 frame with string content.

     - parameter content: the content of the ID3 frame.
     */
    public init(content: String) {
        self.content = content
    }
}

Here is the contents of ID3SubtitleFrameContentParsingOperationFactory:

class ID3SubtitleFrameContentParsingOperationFactory {
    static func make() -> ID3FrameStringContentParsingOperation {
        return ID3FrameStringContentParsingOperationFactory.make() { (content: String) in
            return (.Subtitle, ID3FrameWithStringContent(content: content))
        }
    }
}

Here's ID3FrameStringContentParsingOperation:

typealias createFrameOperation = (String) -> ((FrameName, ID3Frame))

class ID3FrameStringContentParsingOperation: FrameContentParsingOperation {
    private var stringContentParser: ID3FrameStringContentParser
    private var createFrameOperation: createFrameOperation

    init(stringContentParser: ID3FrameStringContentParser,
         assignToTagOperation: @escaping createFrameOperation) {
        self.stringContentParser = stringContentParser
        self.createFrameOperation = assignToTagOperation
    }

    func parse(frame: Data, version: ID3Version, completed: (FrameName, ID3Frame) -> ()) {
        if let frameContent = stringContentParser.parse(frame: frame, version: version) {
            let frameNameAndFrame = createFrameOperation(frameContent)
            completed(frameNameAndFrame.0, frameNameAndFrame.1)
        }
    }
}

and here's ID3FrameStringContentParsingOperationFactory:

import Foundation

class ID3FrameStringContentParsingOperationFactory {
    static func make(operation: @escaping createFrameOperation) -> ID3FrameStringContentParsingOperation {
        let stringContentParser = ID3FrameStringContentParserFactory.make()
        return ID3FrameStringContentParsingOperation(stringContentParser: stringContentParser,
                                                     assignToTagOperation: operation)
    }
}

and the ID3FrameStringContentParserFactory, (which somehow inexplicably differs from the ID3FrameStringContentParsingOperationFactory)

class ID3FrameStringContentParserFactory {
    static func make() -> ID3FrameStringContentParser {
        let id3FrameConfiguration = ID3FrameConfiguration()
        let paddingRemover = PaddingRemoverUsingTrimming()
        let stringEncodingDetector = ID3FrameStringEncodingDetector(
            id3FrameConfiguration: id3FrameConfiguration,
            id3StringEncodingConverter: ID3StringEncodingConverter()
        )
        let stringContentParser = ID3FrameStringContentParser(
            stringEncodingDetector: stringEncodingDetector,
            paddingRemover: paddingRemover,
            id3FrameConfiguration: id3FrameConfiguration
        )
        return stringContentParser
    }
}

And the ID3FrameStringContentParser:

class ID3FrameStringContentParser {
    private let stringEncodingDetector: ID3FrameStringEncodingDetector
    private let paddingRemover: PaddingRemover
    private let id3FrameConfiguration: ID3FrameConfiguration

    init(stringEncodingDetector: ID3FrameStringEncodingDetector,
         paddingRemover: PaddingRemover,
         id3FrameConfiguration: ID3FrameConfiguration) {
        self.stringEncodingDetector = stringEncodingDetector
        self.paddingRemover = paddingRemover
        self.id3FrameConfiguration = id3FrameConfiguration
    }

    func parse(frame: Data, version: ID3Version) -> String? {
        let headerSize = id3FrameConfiguration.headerSizeFor(version: version)
        let frameContentRangeStart = headerSize + id3FrameConfiguration.encodingSize()
        let frameContent = frame.subdata(in: frameContentRangeStart..<frame.count)
        let encoding = stringEncodingDetector.detect(frame: frame, version: version)
        if let frameContentAsString = String(data: frameContent, encoding: encoding) {
            return paddingRemover.removeFrom(string: frameContentAsString)
        } else {
            return nil
        }
    }
}

That's all for a single frame with one parameter, the String that is input. Most of that is reused by all the frames with string content, and some of it I even managed to get away with using when I created my ID3FrameWithIntegerContent type.

The Comment and Lyrics frames are a bit more complicated, because they have three parameters: The language code string (which should ultimately be invisible), the content description string, which should be terminated, and the actual content string, which can be formatted with new lines in it.

I have created ID3FrameCommentLyrics to serve the purpose that ID3FrameWithStringContent serves for single-string frames:

/**
 A class used to represent an ID3 comment or lyrics frame to be used in the ID3 tag.
 */
public class ID3FrameCommentLyrics: ID3Frame, Equatable, CustomDebugStringConvertible {

    /// The ISO-639-2 three-letter language identifier
    private static let locale = NSLocale.autoupdatingCurrent
    public var language: String?
    /// A short description of the frame content.
    public var contentDescription: String?
    /// the content of the frame
    public var contentText: String
    /// ID3FrameCommentLyrics description, useful for debug.
    public var debugDescription: String {
        return "\(String(describing: language)) - \(String(describing: contentDescription)) - \(String(describing: contentText))"
    }

    /**
     Init a ID3 comment or lyrics frame.

     - parameter language: the ISO-639-2 language code.
     - parameter contentDescription: a terminated text string describing the frame content
     - parameter contentText: the full text of the comment or lyric frame.
     */
    public init(language: String?, contentDescription: String?, contentText: String) {
        self.language = ID3FrameCommentLyrics.locale.localizedString(forLanguageCode: language ?? "English")
        self.contentDescription = contentDescription
        self.contentText = contentText
    }

    /**
     Compare two Comment or Lyrics frames.

     - parameter lhs: left side of compare operation.
     - parameter rhs: right side of compare operation.

     - returns: true if the language and content description values are the same, else false.
     */
    public static func ==(lhs: ID3FrameCommentLyrics, rhs: ID3FrameCommentLyrics) -> Bool {
        return lhs.contentDescription == rhs.contentDescription && lhs.language == rhs.language
    }
}

And then I have ID3CommentFrameCreator (and ID3LyricsFrameCreator, which is effectively identical)

class ID3CommentFrameCreator: ID3FrameCreatorsChain {
    private let frameCreator: FrameFromStringContentCreator
    private var id3FrameConfiguration: ID3FrameConfiguration

    init(frameCreator: FrameFromStringContentCreator, id3FrameConfiguration: ID3FrameConfiguration) {
        self.frameCreator = frameCreator
        self.id3FrameConfiguration = id3FrameConfiguration
    }

    override func createFrames(id3Tag: ID3Tag, tag: [UInt8]) -> [UInt8] {
        if let commentFrame = id3Tag.frames[.Comment] as? ID3FrameCommentLyrics {
            let newTag = tag +
                frameCreator.createFrame(
                    frameIdentifier: id3FrameConfiguration.identifierFor(
                        frameType: .Comment,
                        version: id3Tag.properties.version
                    ),
                    version: id3Tag.properties.version,
                    content: adapt(comment: commentFrame)
            )
            return super.createFrames(id3Tag: id3Tag, tag: newTag)
        }
        return super.createFrames(id3Tag: id3Tag, tag: tag)
    }

    private func adapt(comment: ID3FrameCommentLyrics) -> String {
        var commentString = ""
        if let commentLanguage = comment.language {
            commentString = "(\(commentLanguage))"
        }
        if let commentDescription = comment.contentDescription {
            commentString = commentString + "\(commentDescription)"
        }
        commentString = commentString + "\(comment.contentText)"
        return commentString
    }
}

I don't think I'm doing that adapt function properly, because I don't think those three strings should be concatenated, but the only model I have to build around is the Genre frame, which concatenates the Genre.identifier (if used) and the Genre.description string.
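
For comparison, here is a rough sketch of what the frame body would look like if it were assembled as bytes following the spec layout quoted earlier, rather than as one concatenated string. This is not ID3TagEditor's API: makeCommentBody is a hypothetical function, it covers only the body (not the frame header), and it assumes UTF-8 with encoding byte 0x03, which is valid for ID3v2.4 only.

```swift
/// Hypothetical sketch: builds the body of a comment/lyrics frame as
/// [encoding byte][3-byte language][description][0x00][text].
func makeCommentBody(language: String, description: String, text: String) -> [UInt8]? {
    let languageBytes = Array(language.utf8)
    // The language field is a fixed three bytes (an ISO-639-2 code such as "eng").
    guard languageBytes.count == 3 else { return nil }

    var body: [UInt8] = [0x03]          // text encoding: UTF-8 (v2.4 assumption)
    body += languageBytes
    body += Array(description.utf8)
    body.append(0x00)                   // terminator after the short description
    body += Array(text.utf8)            // actual text, no terminator required
    return body
}
```

The language here is the raw three-letter code, not a localized display name, and the description and text stay separated by the terminator byte.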

Then we get to the parsing and it all falls apart because I honestly don't understand the relationship between all the parsers and parsing operation and parsing operation factory types, much less understand how to adapt it to suit a frame type whose content isn't all visible in the end.

class ID3CommentFrameContentParsingOperationFactory {
    static func make() -> ID3FrameStringContentParsingOperation {
        return ID3FrameStringContentParsingOperationFactory.make() { (language: String?, contentDescription: String, contentText: String) in
            #warning("I don't know what to do for an adaptor here, if anything")
            return (.Comment, ID3FrameCommentLyrics(language: <#T##String?#>, contentDescription: <#T##String?#>, contentText: <#T##String#>))
        }
    }
}

which is where I hit a wall because at ID3FrameStringContentParsingOperationFactory.make() { I get an error "Contextual closure type '(String) -> ((FrameName, ID3Frame))' expects 1 argument, but 3 were used in closure body".

And I just don't know how to create that whole parsing cluster in such a way as to work for a frame type with three arguments. (and if I ever get around to making TOC and Chapter frames, there will be something like 5-6 arguments, some of which are timestamps or Double.)
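
One direction that might work, sketched with stand-in types because I can't say how upstream would want it done: keep the closure at one argument, but make that argument a value bundling all the parsed fields. Everything below (SketchCommentFields, SketchParsingOperation, sketchParseFields, makeSketchCommentOperation) is hypothetical and only mirrors the shape of the existing parsing operation.

```swift
/// Hypothetical bundle of the three parsed fields.
struct SketchCommentFields: Equatable {
    let language: String
    let contentDescription: String
    let contentText: String
}

/// Hypothetical operation, generic over the parsed value, so the
/// frame-creating closure still receives exactly one argument.
final class SketchParsingOperation<Parsed, Frame> {
    private let parser: ([UInt8]) -> Parsed?
    private let createFrame: (Parsed) -> Frame

    init(parser: @escaping ([UInt8]) -> Parsed?,
         createFrame: @escaping (Parsed) -> Frame) {
        self.parser = parser
        self.createFrame = createFrame
    }

    func parse(body: [UInt8], completed: (Frame) -> Void) {
        if let parsed = parser(body) {
            completed(createFrame(parsed))
        }
    }
}

/// Stand-in parser: 3 language bytes, description up to 0x00, then text.
/// Assumes UTF-8 and that the encoding byte has already been consumed.
func sketchParseFields(_ body: [UInt8]) -> SketchCommentFields? {
    guard body.count >= 3, let split = body[3...].firstIndex(of: 0x00) else { return nil }
    return SketchCommentFields(
        language: String(decoding: body[..<3], as: UTF8.self),
        contentDescription: String(decoding: body[3..<split], as: UTF8.self),
        contentText: String(decoding: body[(split + 1)...], as: UTF8.self))
}

/// Factory in the style of the existing ones; the "frame" is just a tuple here.
func makeSketchCommentOperation() -> SketchParsingOperation<SketchCommentFields, (String, SketchCommentFields)> {
    return SketchParsingOperation(parser: sketchParseFields) { fields in
        ("Comment", fields)
    }
}
```

The same shape would extend to frames with five or six fields, since only the bundled struct and its parser change.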

So I don't know what to do. But I'm spending an awful lot of time trying to fix OutcastID3 and maybe I just need to accept that it's not a good tool to use.

NCrusher74 commented 4 years ago

(for some reason I can't invite you to review that branch like I can with my other repos, but here's the link if you want to see it: https://github.com/NCrusher74/ID3TagEditor/pull/1)

SDGGiesbrecht commented 4 years ago

That is a lot of types for a simple task.

I looked at it for a few minutes, but I couldn’t really tell where to start either. I would suggest asking in an issue at the upstream repository. The original author ought to be able to point you in the right direction better than I.

If you don’t get an answer, your best bet is to compare a more complicated tag. It sounds like you started from one that was a simple String. Look for one that has more pieces, as that should be able to serve as an example of how to put more of your own pieces together.

NCrusher74 commented 4 years ago

Isn't it? Frankly I am really proud of myself that I got far enough to add 25 additional tags, including a new frame type. When I first looked at the code, I was like "NOPE!" but having it all in so many separate files actually helped. It's a well organized project, all told, even if it does seem to take five different types to do a single thing. The test module is sort of amazing.

The original author was thrilled that I contributed and merged whichever of the new types actually met his mission of keeping the project compliant with ID3 standards. He also kept the UnsyncedLyrics frame even though it wasn't compliant because a few months ago he'd had a feature request to add that one. He said he'd get around to making it compliant when he had the time, and he should know I'm working on it because I tried a pull request where I basically said "I've got you about 75% of the way there on these frames, but I need guidance with the remaining 25%." But it's been about a week and I haven't heard back. His URL has a .it domain and things are pretty shut down in Italy right now, so that could be part of it?

The most complex type in there is the Genre type, which has two possible parameters. There's a list of officially recognized genres (which is the identifier parameter for which he has an enum) and then a description string. Either one is optional. For instance, in my implementation, because audiobook genres have nothing to do with the list of recognized music genres, I pass genre.description on as the string for my genre type, and just use nil for the genre.identifier parameter. Someone using ID3TagEditor for music could use both and concatenate them, or just use one or the other.

Unfortunately, it just doesn't actually seem to be something that easily translates to a frame with multiple string parameters, at least one of which shouldn't even be seen.

This is what ID3GenreFrameContentParsingOperationFactory looks like:

class ID3GenreFrameContentParsingOperationFactory {
    static func make() -> ID3FrameStringContentParsingOperation {
        return ID3FrameStringContentParsingOperationFactory.make() { (content: String) in
            return (.Genre, ID3GenreStringAdapter().adapt(genre: content))
        }
    }
}

which looks pretty simple. It's actually the string adapter that's complicated, because it involves parsing the identifier and/or the description and concatenating them if appropriate.

class ID3GenreStringAdapter {
    func adapt(genre: String) -> ID3FrameGenre {
        let expression = try! NSRegularExpression(pattern: "(\\()\\w*\\d*(\\))")
        guard let genreWithParenthesisRange = Range(
                expression.rangeOfFirstMatch(in: genre, options: [], range: NSMakeRange(0, genre.count)), in: genre
        ) else {
            return ID3FrameGenre(genre: nil, description: genre)
        }
        let genreWithParenthesis = String(genre[genreWithParenthesisRange])
        let genreIdentifier = adaptGenreIdentifierFrom(genreWithParenthesis: genreWithParenthesis)
        let genreDescription = adaptGenreDescriptionFrom(
                genreDescriptionExtracted: String(genre[genreWithParenthesisRange.upperBound..<genre.endIndex]),
                genreIdentifier: genreIdentifier,
                genreWithParenthesis: genreWithParenthesis
        )
        return ID3FrameGenre(genre: genreIdentifier, description: genreDescription)
    }

    private func adaptGenreDescriptionFrom(genreDescriptionExtracted: String,
                                           genreIdentifier: ID3Genre?,
                                           genreWithParenthesis: String) -> String? {
        var genreDescription: String? = genreDescriptionExtracted
        if let validGenreDescription = genreDescription, validGenreDescription.isEmpty {
            genreDescription = nil
        }
        if notAValid(genreIdentifier: genreIdentifier, from: genreWithParenthesis) {
            genreDescription = genreWithParenthesis + (genreDescription ?? "")
        }
        return genreDescription
    }

    private func notAValid(genreIdentifier: ID3Genre?, from genreWithParenthesis: String) -> Bool {
        return genreIdentifier == nil && !genreWithParenthesis.isEmpty
    }

    private func adaptGenreIdentifierFrom(genreWithParenthesis: String) -> ID3Genre? {
        let genreIdentifierStartIndex = genreWithParenthesis.index(after: genreWithParenthesis.startIndex)
        let genreIdentifierEndIndex = genreWithParenthesis.index(before: genreWithParenthesis.endIndex)
        let genreIdentifierRange = genreIdentifierStartIndex..<genreIdentifierEndIndex
        let genreWithoutParenthesis = genreWithParenthesis[genreIdentifierRange]
        if let genreIdentifier = Int(genreWithoutParenthesis),
           let validGenre = ID3Genre(rawValue: genreIdentifier) {
            return validGenre
        }
        if (genreWithoutParenthesis == "RX") {
            return .Remix
        }
        if (genreWithoutParenthesis == "CR") {
            return .Cover
        }
        return nil
    }
}

Fortunately, that's actually it for the genre-parsing and adapting types. Unfortunately, I run away in terror the moment I see NSRegularExpression since my ill-fated attempt at using RegEx when we first became acquainted.
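For what it's worth, the split the adapter performs can be approximated without NSRegularExpression. This is a simplified sketch, not the library's code: it assumes the parenthesised identifier sits at the very start of the string (the regular expression above accepts a match anywhere), and splitGenre is a hypothetical name. It stops at plain strings and leaves the mapping to ID3Genre to the existing adapter.

```swift
// Splits a genre string such as "(17)Rock" into the text between the leading
// pair of parentheses and whatever follows the closing parenthesis.
// Assumption: the identifier, if present, is at the start of the string.
func splitGenre(_ genre: String) -> (identifier: String?, description: String?) {
    guard genre.hasPrefix("("),
          let close = genre.firstIndex(of: ")") else {
        // No leading "(...)": treat the whole string as the description.
        return (nil, genre.isEmpty ? nil : genre)
    }
    let identifier = String(genre[genre.index(after: genre.startIndex)..<close])
    let rest = String(genre[genre.index(after: close)...])
    return (identifier, rest.isEmpty ? nil : rest)
}

let numbered = splitGenre("(17)Rock")
let plain = splitGenre("Audiobook")
let remix = splitGenre("(RX)")
```

Here numbered carries "17" and "Rock", plain carries only a description, and remix carries only the identifier "RX".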

I THINK my Comment and Lyrics types shouldn't require much by way of adaptation or parsing. They're just strings and should be able to use the existing string adaptation and parsing structures. I just can't figure out how to pull that together. I'm assuming I need some equivalent of ID3FrameStringContentParsingOperationFactory that will accept three arguments instead of just one, and possibly a whole new adapter type that will keep the language string hidden and maybe concatenate the tag description and content strings?

All the other frame types deal with integer parameters, like the different date types, or the track/disc type. They take multiple parameters, though, so maybe that is where I need to begin?
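As a point of comparison, the ID3v2 track and disc frames (TRCK, TPOS) store their two values in one text string such as "3/12", so a two-parameter frame can still come out of a single-string parser. A minimal sketch of that split follows; PartOfTotal and parsePartOfTotal are hypothetical names, not ID3TagEditor types.

```swift
// Holds a position and an optional total, e.g. track 3 of 12.
struct PartOfTotal: Equatable {
    let position: Int
    let total: Int?
}

// Parses text of the form "3" or "3/12". Returns nil if the position is not a number.
func parsePartOfTotal(_ text: String) -> PartOfTotal? {
    let pieces = text.split(separator: "/", maxSplits: 1, omittingEmptySubsequences: false)
    guard let first = pieces.first, let position = Int(first) else { return nil }
    // A missing or non-numeric total is reported as nil rather than as an error.
    let total = pieces.count > 1 ? Int(pieces[1]) : nil
    return PartOfTotal(position: position, total: total)
}

let both = parsePartOfTotal("3/12")
let positionOnly = parsePartOfTotal("7")
let invalid = parsePartOfTotal("x")
```

The difference for Comment and Lyrics frames is that their extra fields are separate byte fields in the frame body, not parts of one decoded string, so the split has to happen before decoding.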

NCrusher74 commented 4 years ago

Btw, this is what I found in the FFmpeg code, in id3v2.c under "Parse a comment tag". If I could translate that to Swift maybe I'd have a good beginning?

/**
 * Parse a comment tag.
 */
static void read_comment(AVFormatContext *s, AVIOContext *pb, int taglen,
                      AVDictionary **metadata)
{
    const char *key = "comment";
    uint8_t *dst;
    int encoding, dict_flags = AV_DICT_DONT_OVERWRITE | AV_DICT_DONT_STRDUP_VAL;
    av_unused int language;

    if (taglen < 4)
        return;

    encoding = avio_r8(pb);
    language = avio_rl24(pb);
    taglen -= 4;

    if (decode_str(s, pb, encoding, &dst, &taglen) < 0) {
        av_log(s, AV_LOG_ERROR, "Error reading comment frame, skipped\n");
        return;
    }

    if (dst && !*dst)
        av_freep(&dst);

    if (dst) {
        key = (const char *) dst;
        dict_flags |= AV_DICT_DONT_STRDUP_KEY;
    }

    if (decode_str(s, pb, encoding, &dst, &taglen) < 0) {
        av_log(s, AV_LOG_ERROR, "Error reading comment frame, skipped\n");
        if (dict_flags & AV_DICT_DONT_STRDUP_KEY)
            av_freep((void*)&key);
        return;
    }

    if (dst)
        av_dict_set(metadata, key, (const char *) dst, dict_flags);
}
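The C function reads one encoding byte, reads three language bytes (which it then ignores), and decodes two strings in a row: the description and the comment text. A rough Swift sketch of the same steps is below. It is not a line-for-line translation: ParsedComment and parseCommentBody are hypothetical names, the input is assumed to be the frame body with the frame header already removed, and only the encodings with a one-byte terminator (0 = ISO-8859-1, 3 = UTF-8) are handled.

```swift
import Foundation

struct ParsedComment: Equatable {
    let language: String
    let description: String
    let text: String
}

// Parses the body of a comment-style frame: encoding byte, 3-byte language,
// zero-terminated description, then the text.
// Assumption: UTF-16 encodings (bytes 1 and 2) are not handled in this sketch.
func parseCommentBody(_ body: [UInt8]) -> ParsedComment? {
    // One encoding byte plus three language bytes must be present,
    // matching the `taglen < 4` check in the C code.
    guard body.count >= 4 else { return nil }

    let encoding: String.Encoding
    switch body[0] {
    case 0: encoding = .isoLatin1
    case 3: encoding = .utf8
    default: return nil
    }

    guard let language = String(bytes: body[1..<4], encoding: .isoLatin1) else { return nil }

    // The description runs up to the first zero byte; the text is what follows.
    let rest = body[4...]
    guard let terminator = rest.firstIndex(of: 0),
          let description = String(bytes: rest[..<terminator], encoding: encoding) else {
        return nil
    }

    // Some writers also terminate the text; drop trailing zero bytes if present.
    var textBytes = rest[(terminator + 1)...]
    while textBytes.last == 0 { textBytes.removeLast() }
    guard let text = String(bytes: textBytes, encoding: encoding) else { return nil }

    return ParsedComment(language: language, description: description, text: text)
}

let sampleBody: [UInt8] = [3] + Array("eng".utf8)
    + Array("description".utf8) + [0]
    + Array("Comment Test".utf8)
let parsedComment = parseCommentBody(sampleBody)
```

Unlike the FFmpeg code, this keeps the language instead of discarding it, since the frame type discussed above needs it as a parameter.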

SDGGiesbrecht commented 4 years ago

I have a feeling you know your way around the project much better than me by now, but my guess is you need a kind of FrameContentParsingOperation, such as the other ones in this directory. The picture one does some data dissection like you’ll have to.

NCrusher74 commented 4 years ago

I just realized something. I've finally got CocoaPods working. Which means TagLibiOS is also an option, so I may have at least one other thing I can try before I muddle through creating the frames I need in ID3TagEditor.

NCrusher74 commented 4 years ago

Hrm. Maybe. I need to figure out if there is a way to make it support macOS, because apparently that isn't supported right now. I also need to figure out if I can use a fork with the CocoaPod instead, because this fork has added some nice features that will probably save me a lot of headaches.

NCrusher74 commented 4 years ago

I'm having an issue with CocoaPods. Again.

I made a fork of that repo so that I could edit the .podspec file to support macOS rather than just iOS (or at least to see if it was possible), but it's not working so well for me.

I'm following these instructions to make a private spec repo, but when I get to the point of running the pod repo lint . command, I get this error:

Crystals-MacBook-Pro:TagLibIOS nolainecrusher$ pod repo lint .

Linting spec repo `TagLibIOS`

[!] An unexpected version directory `Tests` was encountered for the `/Users/nolainecrusher/.cocoapods/repos/TagLibIOS/Example` Pod in the `Example` repository.

And from there it's pretty much hopeless. The repo doesn't validate and adding it to my podfile as a source doesn't work.

Any idea what I'm doing wrong?

SDGGiesbrecht commented 4 years ago

I have basically zero experience with CocoaPods. Years ago I toyed with it, but some of its design decisions didn’t seem quite right to me, so I chose other tools instead. I haven’t had a reason to touch it since. So I’m afraid I can’t be much help.

NCrusher74 commented 4 years ago

That's fine. Looking closer, it looks like maybe the people who made the wrapper only did so for the handful of "universal" tags, not all the tags that are available. Which means I'd once again be stuck in a place of not being able to use it for what I need without major modifications that I don't have the qualifications to make.

I could have sworn a month or two ago, I found some pre-compiled TagLib binaries but now I can't locate them to see if they'd be what I need.

NCrusher74 commented 4 years ago

Possibly a silly question.

If there were an app that had all the tags I need to include, would it be possible to use their source code as a framework (I checked their license.) https://github.com/KDE/kid3/tree/master/src

KID3 is a pretty comprehensive tagger. It includes all the stuff I need for ID3 tagging, including the CTOC and CHAP frames. They have a CLI that I've never been able to figure out (though admittedly, I haven't tried since I started working with tags enough that things like what a frame is now make sense to me), which is why I sort of went the route I did with trying to create my own tagging framework.

I spent last night and today looking for a framework that would have everything I need, and the few that do seem to have it don't seem to be usable (I can't build them because there's a file missing or something.) I probably would have been better off spending the time figuring out the parser stuff for ID3TagEditor but I'm just really doubting my abilities there.

SDGGiesbrecht commented 4 years ago

> would it be possible to use their source code as a framework (I checked their license.)

Probably. But you cannot just drag‐and‐drop it. You’d need it to build as a module, not an application. So you’d need to tell either SwiftPM or Xcode where to find all the source files (and any extra flags necessary) so that the same source files produce a module separate from whatever the original author defined.

NCrusher74 commented 4 years ago

Okay. It's largely in C++ but it doesn't have an Xcode project file or anything.

I assume I just need their src directory, not the directories related to the various platform-specific implementations, correct?

I'd start my own Xcode project as a framework, correct? And then if not drag and drop, how would I tell it what to do with the source files? (also, if I understand the research I did correctly, I need an Objective-C intermediary between C++ and Swift, right? So my project should be Objective-C?)

SDGGiesbrecht commented 4 years ago

Unless it has transitive dependencies that don’t have their own sources available, SwiftPM will be easier. You’ll just have to specify more paths than normal, because the repository is arranged in a non‐standard way. (See the manifest documentation here.) (The inherited dependencies were the problem with whatever the project was I started to do that for at one point.)
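To illustrate what "specify more paths than normal" could look like, here is a sketch of a manifest that points a C++ target at a non-standard directory. Everything specific in it is an assumption: the package and target name Kid3Core is made up, "src" is taken from the directory mentioned above, and the header layout is a guess. Whether the sources actually compile this way depends on their dependencies, which is not established here.

```swift
// swift-tools-version:5.1
import PackageDescription

let package = Package(
    name: "Kid3Core",                        // hypothetical package name
    products: [
        .library(name: "Kid3Core", targets: ["Kid3Core"])
    ],
    targets: [
        .target(
            name: "Kid3Core",                // hypothetical target name
            path: "src",                     // non-standard source location instead of Sources/Kid3Core
            exclude: [],                     // list subdirectories or files that must not be compiled
            publicHeadersPath: ".",          // assumption: headers sit beside the sources
            cxxSettings: [
                .headerSearchPath(".")       // extra include path, relative to the target path
            ]
        )
    ],
    cxxLanguageStandard: .cxx14              // assumption: adjust to whatever the sources require
)
```

The path, exclude, sources and publicHeadersPath parameters are the usual knobs for a repository that does not follow the Sources/<TargetName> layout.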

NCrusher74 commented 4 years ago

Yeah I've looked through their /src/core directory and I don't see any dependencies. (MP42Foundation has some contributor-submitted stuff as a submodule, but I don't see any of that in KID3.)

I'll see what I can figure out with SPM.

NCrusher74 commented 4 years ago

So, as an update, you may have noticed the past couple days I have been experimenting with various things. I'm still hoping the author of ID3TagEditor will get back to me and give me some guidance on how to write the parsers and adapters for the complex tags I need (I feel like I'm almost there, but I'm stuck at what is required for a terminated string, of all the silly things.)
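On the terminated string: in ID3v2 the terminator width depends on the frame's text-encoding byte. ISO-8859-1 (0) and UTF-8 (3) end a string with one zero byte, while the two UTF-16 variants (1 and 2) end it with two. A minimal sketch of a reader that respects this is below; readTerminatedString is a hypothetical helper, not part of ID3TagEditor.

```swift
import Foundation

// Reads one terminated string starting at `offset` and returns it together
// with the offset of the first byte after the terminator.
func readTerminatedString(from bytes: [UInt8],
                          at offset: Int,
                          encodingByte: UInt8) -> (string: String, next: Int)? {
    let encoding: String.Encoding
    let width: Int   // terminator width in bytes
    switch encodingByte {
    case 0: encoding = .isoLatin1; width = 1
    case 1: encoding = .utf16; width = 2            // UTF-16 with byte order mark
    case 2: encoding = .utf16BigEndian; width = 2
    case 3: encoding = .utf8; width = 1
    default: return nil
    }
    guard offset >= 0, offset <= bytes.count else { return nil }

    // Step through the data one code unit at a time so that a zero byte
    // inside a UTF-16 code unit is not mistaken for the terminator.
    var index = offset
    while index + width <= bytes.count {
        if bytes[index..<(index + width)].allSatisfy({ $0 == 0 }) {
            guard let string = String(bytes: bytes[offset..<index], encoding: encoding) else {
                return nil
            }
            return (string, index + width)
        }
        index += width
    }
    return nil   // no terminator found
}

// UTF-8: "description", one zero byte, then more data.
let utf8Bytes: [UInt8] = Array("description".utf8) + [0] + Array("Comment Test".utf8)
let utf8Result = readTerminatedString(from: utf8Bytes, at: 0, encodingByte: 3)

// UTF-16BE: "Hi" is 00 48 00 69, followed by the two-byte terminator.
let utf16Bytes: [UInt8] = [0x00, 0x48, 0x00, 0x69, 0x00, 0x00, 0x00, 0x21]
let utf16Result = readTerminatedString(from: utf16Bytes, at: 0, encodingByte: 2)
```

The returned `next` offset is where the following field starts, which is what a description-then-content frame needs in order to keep reading.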

That would probably be the quickest and easiest route, considering how much I've already got invested in that direction.

As a fallback, however, I decided to cast a wider net. I noticed there were a lot of Python modules that do what I need to do, but because they were Python, I didn't think they were an option. Then I uncovered a discussion on the Swift forums (to which you had replied) last year about something called PythonKit, so I'm currently experimenting with that. There's a Python module called Mutagen and the license looks like I could use it for my purposes. I just need to figure out how? It has everything I need, however, so here's hoping.

NCrusher74 commented 4 years ago

I'm going to close this PR and the Chaptering PR, because it looks like I'm going to have to find another solution. I've hit the end of where my current abilities will let me adapt the ID3TagEditor code, and I've been informed by the author that he doesn't have the time to help me finish the process of adding the frames I'm going to need.

The author of OutcastID3 doesn't appear to be responsive and the chances of my adapting that code to work may be slimmer than the chances of my adapting ID3TagEditor.

I'm really quite stuck.
