Closed merveenoyan closed 1 year ago
Also I was wondering the reason why you used tf.io to handle files. Can you explain? @deep-diver
during pushing you always create a repository and you raise an error if the repository already exists.
I thought it was the easiest way. The line of code you pointed out simply tries to create a repository; if the repo already exists, it won't create a new one.
Basically, what it does is: 1. create the repo (the first time the pipeline is run), 2. clone the repo (after creation, or if the repo already exists), 3. create a new branch with the new checkpoint, 4. push.
It handles different versions (checkpoints) of the same model in separate branches rather than having separate directories or overwriting existing files.
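For illustration, here is a minimal sketch of that create / branch / push flow. It is not the HfPusher code: it swaps the clone step for direct `huggingface_hub.HfApi` calls, and the function name `push_checkpoint`, the `v{version}` branch naming, and the `api` parameter are assumptions made up for this example.

```python
from typing import Any, Optional


def push_checkpoint(repo_id: str, folder_path: str, version: str,
                    api: Optional[Any] = None) -> str:
    """Push one checkpoint to its own branch of a Hub model repo (sketch)."""
    if api is None:
        # Imported lazily so a stand-in client can be passed in for testing.
        from huggingface_hub import HfApi
        api = HfApi()

    # 1. Create the repo on the first run; exist_ok=True means an existing
    #    repo is reused instead of raising an error.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)

    # 2./3. One branch per checkpoint version, rather than one directory per
    #       version. The "v<version>" naming is a hypothetical choice.
    branch = f"v{version}"
    api.create_branch(repo_id=repo_id, branch=branch, exist_ok=True)

    # 4. Upload the checkpoint files to that branch.
    api.upload_folder(
        repo_id=repo_id,
        folder_path=folder_path,
        revision=branch,
        commit_message=f"Add checkpoint {version}",
    )
    return branch
```

Calling `push_checkpoint("user/my-model", "./serving_model", "1")` (hypothetical names) would need a valid Hub token in the environment. Running it again with a different `version` adds another branch to the same repo.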
Also I was wondering the reason why you used tf.io to handle files. Can you explain? @deep-diver
tf.io handles file I/O in a destination-agnostic way (AFAIK, Google Cloud Storage (GCS) and the local file system for now). When using GCP services, almost all the intermediate results (Artifacts) are stored in GCS. Also, with tf.io, it is possible to manage HF Spaces template code in GCS too.
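As a rough sketch of what "destination agnostic" means in practice, the helper below copies a directory tree using only `tf.io.gfile` calls (`walk`, `makedirs`, `copy`), so the same code should accept a local path or a `gs://` path. The helper name `copy_tree` and its `gfile` parameter are assumptions for this example, not part of HfPusher.

```python
import posixpath
from typing import Any, Optional


def copy_tree(src_dir: str, dst_dir: str, gfile: Optional[Any] = None) -> int:
    """Recursively copy src_dir into dst_dir; return the number of files copied."""
    if gfile is None:
        # Imported lazily; any object exposing walk/makedirs/copy also works.
        import tensorflow as tf
        gfile = tf.io.gfile

    copied = 0
    for root, _subdirs, files in gfile.walk(src_dir):
        # Path of the current directory relative to src_dir ("" for the top).
        rel = root[len(src_dir):].lstrip("/")
        target = posixpath.join(dst_dir, rel) if rel else dst_dir
        gfile.makedirs(target)
        for name in files:
            gfile.copy(posixpath.join(root, name),
                       posixpath.join(target, name),
                       overwrite=True)
            copied += 1
    return copied
```

For example, `copy_tree("gs://my-bucket/space_template", "/tmp/space")` (a hypothetical bucket) would pull template files from GCS to local disk, assuming GCS credentials are configured. Swapping the two arguments would upload instead.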
@deep-diver sorry, I think I missed that the logger only logs a warning and the error is suppressed. This works!
Hello 🤓 I'm reading the code for HfPusher and I stumbled on this (I'd appreciate it if you could correct me if I'm wrong): during pushing you always create a repository and you raise an error if the repository already exists. I feel like the point of a model repository is the ability to version the same model with different checkpoints in different commits. I feel like it would be good to push to the repository with the given name even if the repository already exists. WDYT?