keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

take over video swin checkpoints #2448

Closed innat closed 18 hours ago

innat commented 3 months ago

Keras-team

Could you please take over the Video Swin checkpoints and upload them to Kaggle so that they can be used on the Kaggle platform?

  1. I have posted about the weights. https://github.com/keras-team/keras-cv/pull/2369#issuecomment-1989420576
  2. Weight verification. https://github.com/keras-team/keras-cv/pull/2369#issuecomment-2028827900
  3. Also, could you please answer this query? https://github.com/keras-team/keras-cv/pull/2369#discussion_r1545657237

divyashreepathihalli commented 2 months ago

Hi Innat!! Will do. Thanks!!

divyashreepathihalli commented 2 months ago

@innat the verification notebooks are not accessible. Can you please double-check the permissions?

innat commented 2 months ago

@divyashreepathihalli

> What is the difference between the base and the backbone weights? Do you have weights for the video classifier task model that you added?

Ignore everything else; you only need to look here.

- videoswin_tiny_kinetics400.weights.h5 <- for backbone
- videoswin_tiny_kinetics400_classifier.weights.h5 <- for classifier

- videoswin_small_kinetics400.weights.h5 <- for backbone
- videoswin_small_kinetics400_classifier.weights.h5 <- for classifier

- videoswin_base_kinetics400.weights.h5 <- for backbone
- videoswin_base_kinetics400_classifier.weights.h5 <- for classifier

Some other variations of this model (also check this comment):

- videoswin_base_kinetics400_imagenet22k.weights.h5 <- for backbone
- videoswin_base_kinetics400_imagenet22k_classifier.weights.h5 <- for classifier

- videoswin_base_kinetics600_imagenet22k.weights.h5 <- for backbone
- videoswin_base_kinetics600_imagenet22k_classifier.weights.h5 <- for classifier

- videoswin_base_something_something_v2.weights.h5 <- for backbone
- videoswin_base_something_something_v2_classifier.weights.h5 <- for classifier

FYI, tiny, small, and base refer to the variants of the model. kinetics400 and kinetics600 refer to the Kinetics dataset with 400 and 600 classes. The _imagenet22k suffix refers to the pre-trained weights that were taken from the 2D Swin image model to initialize the Video Swin model. You can also check the official repo for more clarification.
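
To make the naming scheme above concrete, here is a rough sketch that splits one of the listed file names into its parts. The helper name `parse_checkpoint_name` and the returned keys are hypothetical and not part of keras-cv; the sketch only assumes the `videoswin_<variant>_<dataset>[_imagenet22k][_classifier].weights.h5` pattern visible in the lists above.

```python
def parse_checkpoint_name(filename):
    """Split a Video Swin weight file name into its parts (hypothetical helper)."""
    suffix = ".weights.h5"
    if not filename.endswith(suffix):
        raise ValueError(f"unexpected file name: {filename}")
    stem = filename[: -len(suffix)]

    # A trailing "_classifier" marks classifier weights; otherwise backbone.
    role = "backbone"
    if stem.endswith("_classifier"):
        role = "classifier"
        stem = stem[: -len("_classifier")]

    # A trailing "_imagenet22k" marks initialization from 2D Swin weights.
    imagenet22k_init = stem.endswith("_imagenet22k")
    if imagenet22k_init:
        stem = stem[: -len("_imagenet22k")]

    # What remains should be "videoswin_<variant>_<dataset>".
    prefix, variant, dataset = stem.split("_", 2)
    if prefix != "videoswin":
        raise ValueError(f"unexpected prefix in: {filename}")

    return {
        "variant": variant,
        "dataset": dataset,
        "imagenet22k_init": imagenet22k_init,
        "role": role,
    }


if __name__ == "__main__":
    print(parse_checkpoint_name("videoswin_base_kinetics600_imagenet22k_classifier.weights.h5"))
```

The dataset part is split off last so that names containing underscores, such as something_something_v2, stay in one piece.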

> Also, can you please use keras_model.save_to_preset(checkpoint_name) to save to a preset? This will generate config, metadata, and weights files that we can then upload to Kaggle.

I see, new API. Let me check.

> Can you also please add your checkpoint conversion file here: keras_cv/src/tools/checkpoint_conversion.

I would have to clean up a lot of messy code first. How about the following two files?

divyashreepathihalli commented 2 months ago

I will wait for your generated preset files. Also, I still cannot access the verification files.

innat commented 2 months ago

@divyashreepathihalli Do you have a kaggle id? If so, could you please share?

divyashreepathihalli commented 2 months ago

@innat what do you mean by Kaggle ID?

innat commented 2 months ago

@divyashreepathihalli Uh, sorry for the confusion. I have prepared some Kaggle notebooks (currently private) that will help you evaluate or verify the model and weights. You can also run them out of the box in the Kaggle environment (plug-and-play). If you share your Kaggle ID, I can add you as a collaborator.

Here are the notebooks on Kaggle (currently private). I would rather not make these notebooks public (as I already did something similar before with my own code). The following notebooks load the keras-cv Video Swin model from its latest release. Once the takeover process (checking, verifying, and saving the presets or weights) is done on your side, I will remove these notebooks from my end. I hope that is clear now.

  1. k400-logit-matching-torch-vs-keras-cv (classifier)
  2. k400-logit-matching-torch-vs-keras-cv-backbone
  3. k600-ssv2-logit-matching-torch-vs-keras-cv (classifier)
  4. k600-ssv2-logit-matching-torch-vs-keras-cv-backbone
  5. generate presets for keras-cv with save_to_preset
  6. All of the above notebooks wget the model weights from here.

(Note: in notebooks 1 and 2, we load the official Video Swin model from torchvision (with their API), and in notebooks 3 and 4, we load the official Video Swin model from raw PyTorch code.)
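
For readers without access to the private notebooks, the idea of logit matching can be sketched roughly as follows: run the same input through both models and compare the output vectors. This is only an illustrative sketch in plain Python. The helper names, the tolerance of 1e-4, and the numbers are all assumptions, not values taken from the notebooks.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two logit vectors."""
    if len(reference) != len(candidate):
        raise ValueError("logit vectors must have the same length")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def logits_match(reference, candidate, atol=1e-4):
    """True if every logit agrees within an absolute tolerance (assumed value)."""
    return max_abs_diff(reference, candidate) <= atol


def same_top1(reference, candidate):
    """True if both vectors put their highest score on the same class index."""
    top_ref = max(range(len(reference)), key=reference.__getitem__)
    top_cand = max(range(len(candidate)), key=candidate.__getitem__)
    return top_ref == top_cand


if __name__ == "__main__":
    # Made-up numbers standing in for the PyTorch and Keras outputs.
    torch_logits = [2.31, -0.42, 0.07]
    keras_logits = [2.31002, -0.41999, 0.07001]
    print(logits_match(torch_logits, keras_logits))
    print(same_top1(torch_logits, keras_logits))
```

In practice the vectors would come from the two models' forward passes on an identical input clip, and the tolerance would be chosen to suit the numerical precision in use.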

divyashreepathihalli commented 2 months ago

Hi Innat!! here is my kaggle id - divyasss.

divyashreepathihalli commented 2 months ago

Also here is the process for presets

innat commented 2 months ago

@divyashreepathihalli

> and update the kaggle handle to be "kaggle_handle": "kaggle://keras//keras//<version_number>"

I would prefer not to do that, because uploading all the weights to Kaggle would take a long time. However, is it possible to test the weights with a local file path? Also, if I want to load the preset files manually, what are the loading APIs? For example, to load the .weights.h5 file, is that load_weights?



!ls video-swin-presets/videoswin_base_kinetics400
- config.json
- metadata.json
- model.weights.h5

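# Import path assumed from the keras-cv public API.
from keras_cv.models import VideoClassifier, VideoSwinBackbone


def vswin_tiny():
    # Tiny variant of the Video Swin backbone.
    backbone = VideoSwinBackbone(
        input_shape=(32, 224, 224, 3),
        embed_dim=96,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        include_rescaling=False,
    )
    # Classifier for the 400 Kinetics classes; activation=None returns raw logits.
    keras_model = VideoClassifier(
        backbone=backbone,
        num_classes=400,
        activation=None,
        pooling='avg',
    )

    # option 1: load the weights file inside the preset folder
    # (assumes the tiny preset has the same layout as the listing above)
    keras_model.load_weights(
        'video-swin-presets/videoswin_tiny_kinetics400/model.weights.h5'
    )

    # option 2: or is there something like this? (load_presets is a guessed name)
    keras_model.load_presets(
        'video-swin-presets/videoswin_tiny_kinetics400'
    )
    return keras_model
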
divyashreepathihalli commented 1 month ago

Loading the preset is done using model.from_preset("preset_name or kaggle uri").
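
Before pointing a loader at a local preset folder, it may help to check that the folder actually holds the three files shown in the directory listing earlier in this thread (config.json, metadata.json, model.weights.h5). The sketch below does only that, using the standard library. The helper name `missing_preset_files` is hypothetical and not part of keras-cv, and the expected file list is taken from that one listing rather than from any specification.

```python
from pathlib import Path

# File names taken from the preset directory listing in this thread.
EXPECTED_PRESET_FILES = ("config.json", "metadata.json", "model.weights.h5")


def missing_preset_files(preset_dir):
    """Return the expected preset files that are absent from preset_dir."""
    root = Path(preset_dir)
    return [name for name in EXPECTED_PRESET_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_preset_files("video-swin-presets/videoswin_tiny_kinetics400")
    if missing:
        print("preset folder is incomplete, missing:", missing)
    else:
        print("preset folder looks complete")
```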

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 18 hours ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.