androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
https://developer.android.com/media/media3
Apache License 2.0

Media3 Transformer Concatenation Issue #1658

Open avinash--reddy opened 2 weeks ago

avinash--reddy commented 2 weeks ago

Version

Media3 main branch

More version details

Library Version: androidx.media3:media3-transformer:1.4.1
Android Version: Android 13 and other emulators
Device Model: Tested on Moto Edge 40 and Pixel emulators

Devices that reproduce the issue

Device Model: Tested in Moto Edge 40 and pixel emulators

Devices that do not reproduce the issue

NA

Reproducible in the demo app?

Yes

Reproduction steps

Steps to reproduce:

1. Create three video media items, each with a different effect (e.g., zoom, rotate).
2. Export each media item with its respective effect.
3. Create a sequence of these media items using EditedMediaItemSequence.
4. Build a Composition with the sequence.
5. Start the transformation to concatenate the videos into one output file.
6. Inspect the output file.

private fun animateAndExport(context: Context, fileUri1: Uri, fileUri2: Uri, fileUri3: Uri) {
    val mediaItem1 = MediaItem.Builder().setUri(fileUri1).build()
    val mediaItem2 = MediaItem.Builder().setUri(fileUri2).build()
    val mediaItem3 = MediaItem.Builder().setUri(fileUri3).build()

val editedMediaItem1 = EditedMediaItem.Builder(mediaItem1)
    .setEffects(Effects(listOf(), listOf(createZoomEffect(2000, 0.5f, 1f))))
    .build()

val editedMediaItem2 = EditedMediaItem.Builder(mediaItem2)
    .setEffects(Effects(listOf(), listOf(createRotateEffect(2000))))
    .build()

val editedMediaItem3 = EditedMediaItem.Builder(mediaItem3)
    .setEffects(Effects(listOf(), listOf(createZoomEffect(2000, 1f, 0.5f))))
    .build()

val videoSequence = EditedMediaItemSequence(editedMediaItem1, editedMediaItem2, editedMediaItem3)
val composition = Composition.Builder(videoSequence).build()

val outputFile = File(context.getExternalFilesDir(Environment.DIRECTORY_MOVIES), "combined_video.mp4")
val transformer = Transformer.Builder(context)
    .setVideoMimeType(MimeTypes.VIDEO_H265)
    .setAudioMimeType(MimeTypes.AUDIO_AAC)
    .build()

transformer.start(composition, outputFile.absolutePath)

}


If I use val videoSequence = EditedMediaItemSequence(editedMediaItem1, editedMediaItem2, editedMediaItem3), then the first item's effects are applied, but not the second and third items'.

If I use val videoSequence = EditedMediaItemSequence(editedMediaItem2, editedMediaItem1, editedMediaItem3), then the second item's effects are applied, but not the first and third items'.

To summarize: whichever item is first in the sequence has its effects applied, but the other items do not.

I was able to reproduce this in the demo app by adding multiple media items in the createComposition() method at https://github.com/androidx/media/blob/main/demos/transformer/src/main/java/androidx/media3/demo/transformer/TransformerActivity.java#L359.

Expected result

All applied effects should be visible in the final concatenated video, reflecting the effects specified for each media item in the sequence.

Actual result

Only the effects from the first media item are applied in the final output video. Effects from subsequent media items are not visible.

Media

https://github.com/user-attachments/assets/789b8f81-c03d-4700-b8cb-c5ce8e8eb8cc

Bug Report

droid-girl commented 2 weeks ago

@avinash--reddy just to clarify, the attached animated.mp4 is the input video you use? When you write that the effects are not visible, do you mean that the 2 other videos are part of the export but no effects are applied?

avinash--reddy commented 2 weeks ago

@droid-girl Yes, the attached one is a 9-second video. Each input was actually a 3-second video, transformed into the 9-second result by adding the three items to the sequence with 3 different animations. (I added the code in the steps-to-reproduce section.)

 val videoSequence = EditedMediaItemSequence(editedMediaItem1, editedMediaItem2, editedMediaItem3)
val composition = Composition.Builder(videoSequence).build()

FYI, the result is the same even if I try with 3 different videos. The animation is applied only to the first video in the sequence.

avinash--reddy commented 2 weeks ago

https://github.com/user-attachments/assets/1422ebf1-60e8-4fa7-b6ff-6f7b683122ea

Attaching this video with 3 different files for more clarity.

avinash--reddy commented 2 weeks ago

Updated the video in the issue too with 3 diff files for more clarity.

droid-girl commented 2 weeks ago

Thank you for reporting, we will look into it.

SheenaChhabra commented 2 weeks ago

@avinash--reddy Can you please add snippets of your createZoomEffect() and createRotateEffect() methods? It will help us understand the issue.

avinash--reddy commented 1 week ago

Sure. Here is the code snippet. @SheenaChhabra

private fun createZoomEffect(
        durationMs: Long,
        scaleStart: Float,
        scaleEnd: Float
    ): MatrixTransformation {
        return MatrixTransformation { presentationTimeUs ->
            val elapsedTimeMs = presentationTimeUs / 1_000f // Convert microseconds to milliseconds

            // Ensure the effect only applies within the duration window
            if (elapsedTimeMs > durationMs) {
                return@MatrixTransformation Matrix() // Return identity matrix if beyond the duration
            }

            // Interpolate the zoom scale linearly over the elapsed time
            val scale = scaleStart + (scaleEnd - scaleStart) * (elapsedTimeMs / durationMs)

            val transformationMatrix = Matrix()
            transformationMatrix.postScale(scale, scale)
            transformationMatrix
        }
    }

    private fun createRotateEffect(durationMs: Long): MatrixTransformation {
        return MatrixTransformation { presentationTimeUs ->
            val elapsedTimeMs = presentationTimeUs / 1_000f // Convert microseconds to milliseconds

            // Ensure the effect only applies within the duration window
            if (elapsedTimeMs > durationMs) {
                return@MatrixTransformation Matrix() // Return identity matrix if beyond the duration
            }

            // Calculate the rotation angle based on elapsed time
            val maxRotation = 360f // Maximum rotation angle
            val rotation = (maxRotation * (elapsedTimeMs / durationMs)) % maxRotation

            val transformationMatrix = Matrix()
            transformationMatrix.postRotate(rotation, /* px */ 0f, /* py */ 0f)
            transformationMatrix
        }
    }
avinash--reddy commented 1 week ago

I understood that the issue is due to presentationTimeUs. We have to update the val elapsedTimeMs = presentationTimeUs / 1_000f logic based on the video's start time within the sequence.

For example, my first video is 3 seconds, so for the second video, which has the rotate effect, I have to update the logic as below: (presentationTimeUs - 3000000000000) / 1_000f

private fun createRotateEffect(durationMs: Long): MatrixTransformation {
        return MatrixTransformation { presentationTimeUs ->
            val elapsedTimeMs =
                (presentationTimeUs - 3000000000000) / 1_000f // Convert microseconds to milliseconds
            Log.d(
                "RotateEffect",
                "presentationTimeUs: $presentationTimeUs, elapsedTimeMs: $elapsedTimeMs"
            )
            // Ensure the effect only applies within the duration window
            if (elapsedTimeMs > durationMs) {
                return@MatrixTransformation Matrix() // Return identity matrix if beyond the duration
            }

            // Calculate the rotation angle based on elapsed time
            val maxRotation = 360f // Maximum rotation angle
            val rotation = (maxRotation * (elapsedTimeMs / durationMs)) % maxRotation

            val transformationMatrix = Matrix()
            transformationMatrix.postRotate(rotation, /* px */ 0f, /* py */ 0f)
            transformationMatrix
        }
    }

Please let me know if this is the right fix, or if a better fix is available. Thank you.
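For reference, a generalized sketch of this idea with the offset passed in as a parameter instead of hard-coded (startOffsetUs is a hypothetical parameter name introduced here, not a Media3 API; it would be the summed duration in microseconds of all preceding items in the sequence):

```kotlin
import android.graphics.Matrix
import androidx.media3.effect.MatrixTransformation

// Sketch: the same rotate effect, parameterized by the item's start offset
// within the sequence so the timestamp math works for any position.
private fun createRotateEffect(durationMs: Long, startOffsetUs: Long): MatrixTransformation {
    return MatrixTransformation { presentationTimeUs ->
        // Item-relative elapsed time, converted from microseconds to milliseconds.
        val elapsedTimeMs = (presentationTimeUs - startOffsetUs) / 1_000f

        if (elapsedTimeMs < 0 || elapsedTimeMs > durationMs) {
            // Outside this item's window: return the identity matrix.
            return@MatrixTransformation Matrix()
        }

        // Rotate a full turn over the duration window.
        val maxRotation = 360f
        val rotation = (maxRotation * (elapsedTimeMs / durationMs)) % maxRotation

        Matrix().apply { postRotate(rotation, /* px= */ 0f, /* py= */ 0f) }
    }
}
```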

ychaparov commented 1 week ago

Yes, subtracting an offset from presentationTimeUs makes sense.

The presentationTimeUs values are continuous across the EditedMediaItemSequence - so subsequent items' effects will see a larger value.

For a sequence EditedMediaItemSequence(editedMediaItem1, editedMediaItem2, editedMediaItem3)

To compute the offset for editedMediaItem3 you should sum up the durations of editedMediaItem1 and editedMediaItem2.

And you can get this duration programmatically by doing something similar to the code here https://github.com/androidx/media/blob/c35a9d62baec57118ea898e271ac66819399649b/libraries/transformer/src/main/java/androidx/media3/transformer/EditedMediaItem.java#L286-L303
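For illustration, the per-item offsets could be computed roughly like this (a sketch: cumulativeOffsetsUs is a hypothetical helper I'm introducing here, and the per-item durations are assumed to come from something like the linked EditedMediaItem code):

```kotlin
// Sketch: cumulative presentation-time offsets for items in a sequence.
// Each item's offset is the sum of the durations of all preceding items.
fun cumulativeOffsetsUs(durationsUs: List<Long>): List<Long> {
    val offsets = mutableListOf<Long>()
    var total = 0L
    for (durationUs in durationsUs) {
        offsets.add(total) // offset of this item = sum of preceding durations
        total += durationUs
    }
    return offsets
}

// Three 3-second items:
// cumulativeOffsetsUs(listOf(3_000_000L, 3_000_000L, 3_000_000L))
// == listOf(0L, 3_000_000L, 6_000_000L)
```

Each effect would then subtract its item's offset from presentationTimeUs before doing any duration math.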

avinash--reddy commented 1 week ago

Thanks @ychaparov. But in some instances I see the presentationTimeUs value as 3000000000000, and in others I see it as 3000000.

ychaparov commented 1 week ago

When are you seeing 3000000000000 and when 3000000?

Is one using Transformer, and the other using CompositionPlayer?

FYI @claincly

claincly commented 1 week ago

My first video is 3 seconds. So for second video which has rotate effect I have to update my logic as below. (presentationTimeUs - 3000000000000) / 1_000f

I'm not sure how the 3000000000000 number comes into play. It looks like the renderer offset that we add, but I don't see why you'd get that number; we make sure this number is not visible to effects. (It's strange because this number should not be additive: you should not get 3000000000000 just because you have three media items.)

Have you modified other parts of the player? How did you clip your media?

I did a quick hack and it seems to be working fine with our demo. You can see the frame timestamps printed on the frames, in microseconds. I also tested with all-image inputs and it worked.

https://github.com/user-attachments/assets/41862330-6166-4d79-ba3a-1fd9313d3016