Open johndpope opened 3 weeks ago
Hi @johndpope
From your description I got some overall understanding of what you are trying to achieve. What VALI features are you interested in WRT your project ?
I’m looking at this framework from a performance perspective.
It does natively support:
I had a play with multithreading on CPU, but to get the images produced it looks like I can't get away from the CPU. https://github.com/johndpope/MegaPortrait-hack/issues/38
I think the problem is that PIL Image has never had GPU support...
I used the StreamReader from torchvision, somewhat successfully, to cycle through frames:
```python
streamer = StreamReader(src=video_path)
# Supported output formats:
# - "rgb24":   8 bits * 3 channels (R, G, B)
# - "bgr24":   8 bits * 3 channels (B, G, R)
# - "yuv420p": 8 bits * 3 channels (Y, U, V)
# - "gray":    8 bits * 1 channel
streamer.add_basic_video_stream(
    frames_per_chunk=16000,
    frame_rate=25,
    width=self.width,
    height=self.height,
    format="rgb24",
)
```
I appreciate your timely response. I guess I'll continue down this streamer path for now, but you're saying I can do GPU + images? Maybe I can extend your work. Off to karate training now; I'll take another look later.
@johndpope
If you just want to get decoded video frames and run your inference on them as a first step, you may follow the code from the torch segmentation test:
It keeps everything on the GPU. There's still room for performance optimization which wasn't done because it's a unit test, not a perf test — e.g., decoding video frames in batches instead of one by one. Anyway, it may be a good starting point.
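The batching suggestion above can be sketched framework-agnostically. The helper below is hypothetical (not part of VALI or the segmentation test); it groups frames from any decoder iterator into fixed-size batches so inference can run once per batch rather than once per frame:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def batched(frames: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group decoded frames into fixed-size batches; the last batch may be short."""
    batch: List[T] = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

With a real decoder, each batch of GPU-resident frames would then be stacked into one tensor and passed to the model in a single call.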
@johndpope
Hi. Please LMK if your issue is resolved.
Hi @RomanArzumanyan - thanks. I'll circle back when I have more bandwidth; closing for now. For this kind of wrapper, I had a crack at CuPy - https://github.com/johndpope/MegaPortrait-hack/blob/feat/38-multicore/EmoDataset.py#L251 - it wasn't successful, but I'll take another look with this.
Processing a video on the GPU and then augmenting it etc. would be very valuable.
Please consider adding some helper wrappers to make it easier to work with the frames. I'll take a stab when I have bandwidth.
```python
tensor_frame, image_frame = self.augmentation(frame, self.pixel_transform, state)
if self.apply_crop_warping:
    transform = transforms.Compose([
        transforms.Resize((self.width, self.height)),
        transforms.ToTensor(),
    ])
```
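A helper wrapper along the lines requested above could hide the per-frame transform plumbing behind one callable. The sketch below is my own (the class name and API are hypothetical, not part of VALI); it mimics the shape of torchvision's `transforms.Compose` using plain callables, so the same pipeline can be mapped over a stream of decoded frames:

```python
from typing import Callable, Iterable, Iterator, List

Frame = object  # stand-in for a decoded frame / GPU tensor

class FramePipeline:
    """Compose per-frame transforms and apply them across a frame stream."""

    def __init__(self, transforms: List[Callable[[Frame], Frame]]):
        self.transforms = transforms

    def __call__(self, frame: Frame) -> Frame:
        # Apply each transform in order to a single frame.
        for t in self.transforms:
            frame = t(frame)
        return frame

    def map(self, frames: Iterable[Frame]) -> Iterator[Frame]:
        # Lazily transform every frame yielded by a decoder.
        for frame in frames:
            yield self(frame)
```

In a GPU pipeline the callables would be device-side ops (resize, normalize, crop); here any callables work, which keeps the wrapper decoder-agnostic.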
I’m looking at this framework from a performance perspective - I want to use it to quickly preprocess videos en masse.
I'm happy to help build a wrapper - I would need it to match decord to make it easier to consume. I really want to achieve this, though:
https://github.com/johndpope/MegaPortrait-hack/issues/38
To get the best performance I'm prepared to write C++ code for this. Or are there other things that come to mind?
Related - https://github.com/dmlc/decord/issues/283