zixaphir / Stable-Diffusion-Webui-Civitai-Helper

Stable Diffusion Webui Extension for Civitai, to manage your models much more easily.

updating is broken #24

Closed paboum closed 8 months ago

paboum commented 8 months ago

I have around 200 Loras installed. When I try to "Check models' new version", it is enough for just one of them to fail with:

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='civitai.com', port=443): Read timed out. (read timeout=5)

and the whole update breaks and the results are lost. This should either retry connections, present the partial results to me, or use some other best-effort approach. Simply failing and displaying (Error) is not enough.

The only workaround I can think of right now is moving most Loras out of my setup, running the update for the partial set, and hoping it succeeds.

Another thing is that calling Civitai API 200 times just to see if anything was updated seems cumbersome. First of all, we could assume that each lora is only updated once a day - if it has been checked less than a day ago, then skip it. This alone would resolve my issue, btw; see the sketch below. Then, a bulk query to the API should be possible, asking for the most recent version of each lora in the set. If their API doesn't allow that (and they refuse to improve it), then caching such results would be prudent - just have a microservice query for top 5000 loras daily, and have Civitai Helper fetch that result instead of querying Civitai directly. They benefit from this too, as their API only gets asked 5000 times daily, instead of all users asking about all loras multiple times.
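
A minimal sketch of the skip-if-recently-checked idea, with a hypothetical cache file (Civitai Helper would use its own storage):

import json
import time

CHECK_CACHE = "last_checked.json"  # hypothetical location
ONE_DAY = 24 * 60 * 60

def load_cache():
    try:
        with open(CHECK_CACHE, encoding="utf-8") as cache_file:
            return json.load(cache_file)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

def needs_check(model_name, cache):
    # Assume a lora is updated at most once a day: skip anything
    # that was already checked within the last 24 hours.
    return time.time() - cache.get(model_name, 0) > ONE_DAY

def mark_checked(model_name, cache):
    cache[model_name] = time.time()
    with open(CHECK_CACHE, "w", encoding="utf-8") as cache_file:
        json.dump(cache, cache_file)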

Finally, as a side remark, just checking the sha hash of a model isn't the best method to identify it. I use https://github.com/arenasys/stable-diffusion-webui-model-toolkit to prune checkpoint models of unnecessary stuff, which makes them easier to squeeze into my VRAM. Now if I use Civitai Helper to find updates for them, it yields errors because the hash has changed. Either Civitai Helper should somehow know what the checksum of a model was before pruning, or calculate the checksum of the part that is never pruned, or use the above-mentioned microservice to translate various hashes into their normalised form.

zixaphir commented 8 months ago

and the whole update breaks and the results are lost. This should either retry connections, present the partial results to me, or use some other best-effort approach. Simply failing and displaying (Error) is not enough.

I apologize for this. In the last update, I set a timeout on connections that was too low. This has already been fixed on dev, along with better error messages, and it will retry a few times if it fails.
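
For illustration only (this is a sketch, not the actual dev-branch code), the retry logic amounts to something like:

import requests

def get_with_retries(url, retries=3, timeout=30):
    for attempt in range(1, retries + 1):
        try:
            response = requests.get(url, timeout=timeout)
            if response.ok:
                return response
            print(f"Request failed with {response.status_code} (attempt {attempt}/{retries})")
        except requests.exceptions.Timeout:
            print(f"Request timed out (attempt {attempt}/{retries})")
    return None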

Another thing is that calling Civitai API 200 times just to see if anything was updated seems cumbersome.

I do agree that this is a little silly, but as far as I can tell from their API documentation, there is no way to query more than one model at a time in a granular way.

if it has been checked less than a day ago, then skip it.

I can work to implement this, yes.

just have a microservice query for top 5000 loras daily

I don't really have the resources to do this.

just checking the sha hash of a model isn't the best method to identify it.

The checksum is only used for fetching the initial information when scanning models that don't have model information yet. There is no way to determine the origin of a user-created model (which is what a user-pruned model is). As long as the pruned model keeps the same name, or the metadata file is renamed to match it, the model information should be preserved. But doing that automatically would require writing compatibility code for every extension that can prune models, and there would be no way to account for external tools.
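
For context, that initial scan boils down to hashing the file and asking Civitai's by-hash endpoint, roughly like this (a sketch, not this extension's exact code):

import hashlib
import requests

def sha256_of_file(path):
    file_hash = hashlib.sha256()
    with open(path, "rb") as model_file:
        for chunk in iter(lambda: model_file.read(1 << 20), b""):
            file_hash.update(chunk)
    return file_hash.hexdigest()

def lookup_model_by_hash(path):
    # A pruned file hashes differently, so this lookup fails for it,
    # which is exactly the limitation described above.
    url = f"https://civitai.com/api/v1/model-versions/by-hash/{sha256_of_file(path)}"
    response = requests.get(url, timeout=30)
    return response.json() if response.ok else None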

paboum commented 8 months ago

I apologize for this. In the last update, I set a timeout on connections that was too low. This has already been fixed on dev, along with better error messages, and it will retry a few times if it fails.

This seems to have helped. Thanks!

I do agree that this is a little silly, but as far as I can tell from their API documentation, there is no way to query more than one model at a time in a granular way.

Not sure if this helps, but perhaps they would be interested in adapting their API to customers' needs? After all, they want us to use their API (it's working), but they surely don't want their servers overloaded by suboptimal queries. I would try and contact them.

The checksum is only used for fetching the initial information when scanning models that don't have model information yet. There is no way to determine the origin of a user-created model (which is what a user-pruned model is). As long as the pruned model keeps the same name, or the metadata file is renamed to match it, the model information should be preserved. But doing that automatically would require writing compatibility code for every extension that can prune models, and there would be no way to account for external tools.

This seems like a good reason to suggest a change to the .safetensors file format (and possibly others) to include a manifest stating the model author's signature, the model version and/or generation date, and perhaps the other information Civitai Helper struggles to gather in other ways. Apparently hundreds of thousands of models are incoming, and the community needs some level of structure for installing, updating and using them. Civitai seems like a good place to start such an effort. Perhaps they are already working on something like this?

zixaphir commented 8 months ago

This seems like a good reason to suggest a change to the .safetensors file format (and possibly others) to include a manifest stating the model author's signature, the model version and/or generation date, and perhaps the other information Civitai Helper struggles to gather in other ways.

The safetensors format does allow for storing metadata of arbitrary length in its header, but the bigger issue is convincing model authors to use it. I suppose Civitai could edit the model headers themselves, but that would complicate identifying a model, since any change to the file header changes the hash, and the hash also doubles as a security feature: without a matching hash, you have no way of verifying that Civitai hasn't injected something dangerous into a model post-upload.
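
For what it's worth, reading such metadata is cheap; the header is plain JSON at the start of the file. A minimal sketch:

import json
import struct

def read_safetensors_metadata(path):
    # A .safetensors file starts with an 8-byte little-endian length N,
    # followed by N bytes of JSON; author-supplied key/value strings,
    # if any, live under the "__metadata__" key.
    with open(path, "rb") as model_file:
        header_len = struct.unpack("<Q", model_file.read(8))[0]
        header = json.loads(model_file.read(header_len))
    return header.get("__metadata__", {})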

There is a usable function in webui that only hashes the content after the header, but the issue is that there's no way to look up a model by that hash, and it would still change with pruning.
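
The idea there is roughly this (a sketch of the approach, not webui's actual function):

import hashlib
import struct

def hash_tensor_data(path):
    # Hash only the tensor bytes that follow the JSON header, so edits
    # to the metadata don't change the hash. Pruning still would, as
    # noted above, and there is no service that indexes such hashes.
    digest = hashlib.sha256()
    with open(path, "rb") as model_file:
        header_len = struct.unpack("<Q", model_file.read(8))[0]
        model_file.seek(8 + header_len)
        for chunk in iter(lambda: model_file.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()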

paboum commented 8 months ago

I've decided to stop pruning Loras, since only the checkpoint size impacts memory significantly. I'm trying to restore all the original Loras now.

With limited success, I was able to recollect some original models with a script based on this idea:

l=some_lora_name
curl -s "https://civitai.com/api/v1/models?query=$l" > temp
cat temp | jq -r '.items[].modelVersions[] | "\(.id):\(.files[].name)"' | while read s
    do
        id=$(echo "$s" | cut -d ':' -f 1)
        filename=$(echo "$s" | cut -d ':' -f 2)
        if [ "$l.safetensors" = "$filename" ]
            then wget -qO "$filename" "https://civitai.com/api/download/models/$id"
        fi
    done

This mostly fails on filenames that include version numbers, in various formats. I am currently experimenting with sed commands to trim -V10, _v1.2 and similar suffixes.

Perhaps Civitai Helper could include a similar heuristic for the loras that can't be found by their hash.

Another approach would be to simply create an index: a dictionary from strings (filenames, hashes, or both) to model id numbers. This would be an append-only data structure and could even be hardcoded in the source code. The user could then choose between a fast offline query and a deep, up-to-date online check.
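
Something like this (the ids and keys below are invented for illustration):

# Append-only mapping from filename and/or hash to Civitai model id;
# it could ship with the extension and be refreshed on demand.
MODEL_INDEX = {
    "elevator_v0.4-locon-000007.safetensors": 12345,  # invented id
    "9a0364b9e99bb480dd25e1f0284c8555": 67890,        # invented hash -> id
}

def lookup_model_id(filename_or_hash):
    return MODEL_INDEX.get(filename_or_hash)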

paboum commented 8 months ago

I've managed to improve the above enough to recollect ~50% of my Loras, which is good enough for now. The key part:

        echo "$s" | sed -e "s:[-_ \.]*[vV]\?[0-9\.]\+[a-z]\?$::" -e "s:_: :g" -e "s:\([A-Z]\): \1:g" |

Other than this, I suggested that Model Toolkit preserve the original file information so that Civitai Helper can access it (https://github.com/arenasys/stable-diffusion-webui-model-toolkit/issues/41) and that Civitai allow filename search in the API (https://github.com/orgs/civitai/discussions/183#discussioncomment-7257089).

zixaphir commented 8 months ago

The key part:

        echo "$s" | sed -e "s:[-_ \.]*[vV]\?[0-9\.]\+[a-z]\?$::" -e "s:_: :g" -e "s:\([A-Z]\): \1:g" |

I think I understand most of this except the last "s:\([A-Z]\): \1:g"

paboum commented 8 months ago

It adds a space before each capital letter, e.g. SlawomirMentzen becomes Slawomir Mentzen. I couldn't find that Lora through this API without the space (it searches the Lora's title, not the filename).

Btw, I've found the private API Civitai's webpage uses, via the https://meilisearch-new.civitai.com/multi-search endpoint and an Authorization Bearer token taken from my web browser session. It is possible to use it with https://github.com/lwthiker/curl-impersonate to obtain the id number for almost every filename, like:

curl_chrome116 ... --data-raw '{"queries":[{"q":"elevator_v0.4-locon-000007", "indexUid":"models_v2"}]}' > temp
cat temp | jq '.results[].hits[].id' | while read id
    do
        curl -s "https://civitai.com/api/v1/models/$id" > temp2
        cat temp2 | jq -r '.modelVersions[] | "\(.id):\(.files[].name)"' | ...
    done

but I cannot recommend putting it in Civitai Helper's code, for various reasons: a) the user would need to provide their Authorization Bearer token, which may be too difficult; b) Meilisearch seems to be their bottleneck, and they probably pay for it, so they may be very unhappy if you use it outside their web frontend without watching the ads; c) it's cumbersome, to say the least. The mass user should probably wait until they expand the API as suggested in https://github.com/orgs/civitai/discussions/183#discussioncomment-7257089

zixaphir commented 8 months ago

Yeah, I agree that we're not going to use it in this extension. If they wanted to provide a version of that for us to use, they would have documented it.

zixaphir commented 8 months ago

Alright, I've rewritten your shell script in Python, taking some liberties to make it a bit more general-purpose. I'll see about integrating it in the future:

""" download_model_by_name.py
Downloads a model using only the model's filename.
"""
import os
import time
import platform
import re
import sys
import requests
import urllib3

default_headers = {
    "User-Agent": (
        "Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148"
    )
}

SERVICE = "https://civitai.com/api/"

def get_url(url, retries=0, headers=None):
    urllib3.disable_warnings()
    # copy so we never mutate the caller's dict or a shared default
    headers = dict(headers or {})
    for key, val in default_headers.items():
        headers[key] = val
    try:
        response = requests.get(
            url,
            stream=True,
            verify=False,
            headers=headers,
            timeout=100
        )
    except requests.exceptions.Timeout:
        print("Request timed out :(")
        return None

    if not response.ok:
        print(f"GET Request failed with {response.status_code}")
        if response.status_code == 404:
            return None
        if retries < 3:
            print("Retrying")
            return get_url(url, retries + 1, headers)
        return None

    print("GET Request success!")
    return response

def write_file(response, filename):
    downloaded_size = 0
    total_size = int(response.headers['Content-Length'])
    start = time.time()
    with open(filename, "wb") as dl_file:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                dl_file.write(chunk)
                dl_file.flush()

                # The rest of this is just a progress bar
                downloaded_size += len(chunk)
                elapsed = time.time() - start
                speed = downloaded_size // elapsed if elapsed >= 1 else downloaded_size
                # Mac reports filesizes in multiples of 1000
                unit = 1000 if platform.system() == "Darwin" else 1024

                i = 0
                while speed > unit:
                    i = i + 1
                    speed = speed / unit
                    if i >= 3:
                        break

                speed = round(speed, 2)
                multiple = ["", "K", "M", "G"][i]

                # progress
                progress = int(100 * downloaded_size / total_size)
                completed = "-" * min(progress // 2, 50)
                remaining = " " * max(50 - (progress // 2), 0)
                sys.stdout.write(f"\r[{completed}{remaining}] {progress: 3}% @ {speed}{multiple}Bps")
                sys.stdout.flush()
    print("\n")

""" 
# Alternative file write with tqdm progressbar, requires tqdm:
def write_file(data, filename):
    from tqdm import tqdm
    with open(filename, "wb") as dl_file, tqdm(
        total=total_size,
        unit='iB',
        unit_scale=True,
        unit_divisor=1024
    ) as progress_bar:
        for chunk in data.iter_content(chunk_size=1024):
            if chunk:
                downloaded_size = dl_file.write(chunk)
                # write to disk
                dl_file.flush()
                progress_bar.update(downloaded_size)
"""

def model_name_to_service_name(model_name):
    # Strip version-like suffixes (e.g. "-V10", "_v1.2") and split
    # CamelCase into words, mirroring the sed one-liner above.
    service_name = model_name.replace("_", " ")
    service_name = re.sub(
        r"[- .]*v?[0-9.]+(?:[a-z0-9\-]*)?$",
        "",
        service_name,
        flags=re.I
    )
    service_name = re.sub(
        r"([A-Z])",
        lambda x: f" {x.group(0)}",
        service_name
    ).strip()
    service_name = re.sub(r"\s\s+", " ", service_name)
    return service_name

def download_model(model_path):
    # Note: this does not check whether model_path actually exists
    filename = os.path.basename(model_path)
    # remove extension
    model_name, _ = os.path.splitext(filename)
    search_name = model_name_to_service_name(model_name)
    api_query = f"{SERVICE}v1/models?query={search_name}"
    response = get_url(api_query)
    if not response:
        print(f"Could not get model info for {filename} :(")
        return
    model_info = response.json()
    if len(model_info.get("items", [])) == 0:
        print("No models found.")
        return
    for item in model_info["items"]:
        for version in item["modelVersions"]:
            for version_data in version["files"]:
                version_filename = version_data["name"]
                version_id = version_data["id"]
                print(f"Found model {version_id}: {version_filename}")
                if version_filename in [f"{model_name}.{x}" for x in ["safetensors", "ckpt"]]:
                    model_url = version_data["downloadUrl"]
                    version_file = get_url(model_url, headers={"Content-Disposition": None})
                    if version_file:
                        write_file(version_file, model_path)
                        print(f"{filename} saved!")
                        return

if __name__ == "__main__":
    download_model(sys.argv[1])
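
Usage would be something like python download_model_by_name.py models/Lora/my_lora-v1.0.safetensors (path hypothetical): the version suffix is stripped for the search query only, and the first file whose name matches the original filename is downloaded.
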
paboum commented 8 months ago

I will test it on my side after the weekend and let you know.

Meanwhile I created this ticket: https://github.com/bmaltais/kohya_ss/issues/1601 - which I believe could help avoid similar issues in the future, if it becomes a standard.

zixaphir commented 8 months ago

The main part of this issue should be resolved in the latest version. Changes to model update code will be addressed at a later time. If you wish to track that particular feature, feel free to open a new issue as a feature request. However, I do not wish to give users browsing the issue list the impression that updating is still broken.