tensorflow / tfx-addons

Developers helping developers. TFX-Addons is a collection of community projects to build new components, examples, libraries, and tools for TFX. The projects are organized under the auspices of the special interest group, SIG TFX-Addons. Join the group at http://goo.gle/tfx-addons-group
Apache License 2.0

Have model cards in `HFPusher` #200

Open merveenoyan opened 1 year ago

merveenoyan commented 1 year ago

The HFPusher component, recently introduced in this blog post and currently living in this library, has a model cards feature built for the Hugging Face Hub (so it's in the project repository as of now, and not in tfx-addons' HFPusher). We had to go with HF model cards because they follow a different structure: a YAML metadata part on top to enable easy discovery, and a free-text markdown section below it. So it would be good to contribute that to the HFPusher in tfx-addons. Later on, we can introduce other things if there is demand; see this issue for discussion. Also, the Hugging Face Hub client library has many functions for parsing and programmatically adding and editing model cards, so it's convenient for people too! WDYT?
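For readers unfamiliar with the layout described above, here is a minimal sketch of a card with a YAML metadata block on top and free-text markdown below. It is hand-rolled with the standard library only and is not the Hugging Face Hub client API or the HFPusher implementation; `render_model_card` and the example metadata values are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Union

MetadataValue = Union[str, List[str]]


def render_model_card(metadata: Dict[str, MetadataValue], body: str) -> str:
    """Render a card as a YAML front-matter block followed by markdown.

    Hypothetical helper: it only handles flat string values and lists of
    strings, which is enough to show the two-part structure.
    """
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, (list, tuple)):
            # YAML block sequence: one "- item" line per entry.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    # Blank line between the metadata block and the markdown section.
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


card = render_model_card(
    {"library_name": "keras", "tags": ["image-classification", "tfx"]},
    "# My model\n\nFree-text description goes here.",
)
print(card)
```

The metadata block is what would drive discovery and filtering, while everything after the second `---` is ordinary markdown.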

Also pinging @sayakpaul and @deep-diver.

sayakpaul commented 1 year ago

@rcrowe-google this would definitely be a nice feature to have.

rcrowe-google commented 1 year ago

@hanneshapke The difference between the Google Model Cards and the Hugging Face Model Cards might be something to explore in the MCT team.

hanneshapke commented 1 year ago

Wow, what timing, @rcrowe-google. We just discussed this topic in the MCT meeting this morning as part of the framework-agnostic MCT.

@sayakpaul @deep-diver @merveenoyan If you can, please join in on the effort! Any input and contribution from the HF side is extremely valuable!

merveenoyan commented 1 year ago

@hanneshapke can you give me pointers to get started? Or should I just open a PR to integrate the model cards?

hanneshapke commented 1 year ago

@merveenoyan Thank you for offering your help! We are currently collecting the different model/attributes the MCT should support. @deutranium is working on a comparison. If you can, join our next MCT call. We wanted to discuss a project roadmap and decide on the scope.

merveenoyan commented 1 year ago

@hanneshapke sure, that would be nice! I previously worked on model cards for Keras models, see here (I'm currently working on cards for different libraries), so for the model cards in HFPusher I thought of adding similar things (model history, hyperparameters, architecture and more). I'd love to meet 🙂

hanneshapke commented 1 year ago

Hi @merveenoyan, from your experience with the Hugging Face model cards, do you see anything missing in the original MCT proto (https://github.com/tensorflow/model-card-toolkit/blob/main/model_card_toolkit/proto/model_card.proto)?

We would love to add it in future versions. The plan is to support the HF cards too.

merveenoyan commented 1 year ago

Hello @hanneshapke,

The reason I would pick HF over MCT in particular is the choice of a markdown interface over a protobuf one. We have a Python client-side library that lets us programmatically create and edit model cards, and also retrieve them once the model is pushed to the Hugging Face Hub. These model cards consist of a YAML metadata part and a markdown part: the former is used for discoverability and filtering of models on the HF Hub, and the latter is purely informative. You can check out an example here, where the hyperparameters and model architecture plot are added automatically and the rest is filled in by the users. This is essentially what I'm working on adding to the HFPusher component as well. We can also add model history, as the markdown format provides a lot of flexibility.
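To make the "added automatically" part concrete, here is a rough sketch of splitting an existing card into its metadata and markdown parts and appending a generated hyperparameters section. It uses only the standard library and does not use the Hugging Face Hub client or reflect how HFPusher does it; `split_card`, `hyperparameter_section`, and `add_hyperparameters` are hypothetical helpers, and the parsing assumes a simple `---` delimited front matter.

```python
from typing import Dict, Tuple


def split_card(text: str) -> Tuple[str, str]:
    """Split a card into (yaml_front_matter, markdown_body).

    Returns an empty front matter if the text does not start with '---'.
    """
    if not text.startswith("---\n"):
        return "", text
    end = text.find("\n---", 3)  # closing delimiter of the metadata block
    if end == -1:
        return "", text
    front = text[4:end]
    body = text[end + len("\n---"):].lstrip("\n")
    return front, body


def hyperparameter_section(hparams: Dict[str, object]) -> str:
    """Build a markdown table of hyperparameters, sorted by name."""
    rows = ["## Hyperparameters", "", "| Name | Value |", "| --- | --- |"]
    rows += [f"| {name} | {value} |" for name, value in sorted(hparams.items())]
    return "\n".join(rows) + "\n"


def add_hyperparameters(text: str, hparams: Dict[str, object]) -> str:
    """Append the generated section to the markdown part, keeping metadata."""
    front, body = split_card(text)
    new_body = body.rstrip("\n") + "\n\n" + hyperparameter_section(hparams)
    if not front:
        return new_body
    return f"---\n{front}\n---\n\n{new_body}"


card = "---\nlibrary_name: keras\n---\n\n# My model\n\nDescription.\n"
updated = add_hyperparameters(card, {"learning_rate": 0.001, "batch_size": 32})
print(updated)
```

Because the body is plain markdown, further generated sections (for example, model history) could be appended the same way without touching the metadata block.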