weaviate / t2v-transformers-models

This is the repo for the container that holds the models for the text2vec-transformers module
BSD 3-Clause "New" or "Revised" License

app.py improvements #49

Open Andrei-Aksionov opened 1 year ago

Andrei-Aksionov commented 1 year ago

Hi there,

A year or so ago I tried to implement my own transformer module and used this repository as a reference. Recently I decided to revisit it to see whether there is something I could help improve.

While I was reading the app.py file, these lines confused me:

if "CUDA_PER_PROCESS_MEMORY_FRACTION" in os.environ:
    try:
        cuda_per_process_memory_fraction = float(os.getenv("CUDA_PER_PROCESS_MEMORY_FRACTION"))
    except ValueError:
        logger.error(f"Invalid CUDA_PER_PROCESS_MEMORY_FRACTION (should be between 0.0-1.0)")

I mean, why does the exception message say Invalid CUDA_PER_PROCESS_MEMORY_FRACTION (should be between 0.0-1.0) when this exception is only raised if the cast to float fails? Plus there is:

if 0.0 <= cuda_per_process_memory_fraction <= 1.0:
    logger.info(f"CUDA_PER_PROCESS_MEMORY_FRACTION set to {cuda_per_process_memory_fraction}")

What happens if the value is not in that range? Nothing is logged and startup just continues. By the way, I don't think a CUDA_PER_PROCESS_MEMORY_FRACTION value of 0.0 is a valid one (don't use memory at all? 😕)
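
To illustrate that these are two distinct failure modes (a minimal sketch, using a hypothetical out-of-range value):

value = "2.5"                  # parses fine, but is out of the 0.0-1.0 range
try:
    fraction = float(value)    # no ValueError here, so the "should be between 0.0-1.0" message never fires
except ValueError:
    print("only reached for non-numeric strings such as 'abc'")

if 0.0 <= fraction <= 1.0:
    print(f"CUDA_PER_PROCESS_MEMORY_FRACTION set to {fraction}")
# 2.5 falls through both branches: nothing is logged and startup continues with an invalid fraction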

There are also somewhat clunky checks, like:

if cuda_env is not None and cuda_env == "true" or cuda_env == "1":

when it can be written more simply:

if cuda_env in ("true", "1"):
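
A quick sanity check that both forms agree for the realistic values of cuda_env (the original only works because and binds tighter than or):

for cuda_env in (None, "true", "1", "false", ""):
    clunky = cuda_env is not None and cuda_env == "true" or cuda_env == "1"
    simple = cuda_env in ("true", "1")
    assert clunky == simple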

The caveat is that I don't have access to a GPU at the moment, so I cannot check it on my machine. That's why I'm writing this as an issue rather than a PR.


So my take on app.py:startup_event:

@app.on_event("startup")
def startup_event():
    global vec
    global meta_config

    cuda_per_process_memory_fraction = 1.0
    if (value := os.getenv("CUDA_PER_PROCESS_MEMORY_FRACTION")) is not None:
        try:
            cuda_per_process_memory_fraction = float(value)
        except ValueError:
            # keep the 1.0 default so the range check below still compares floats
            logger.error(f"Cannot cast '{value}' to float, falling back to 1.0.")
    if not (0.0 < cuda_per_process_memory_fraction <= 1.0):
        logger.error("Invalid cuda_per_process_memory_fraction (should be in range (0.0, 1.0])")

    if os.getenv("ENABLE_CUDA") in ("true", "1"):
        cuda_support = True
        cuda_core = os.getenv("CUDA_CORE") or "cuda:0"
        logger.info(f"Cuda_core is set to {cuda_core}")
    else:
        cuda_support = False
        cuda_core = ""
        logger.info("Running on CPU")

    direct_tokenize = os.getenv("T2V_TRANSFORMERS_DIRECT_TOKENIZE") in ("true", "1")

    meta_config = Meta('./models/model')
    vec = Vectorizer('./models/model', cuda_support, cuda_core, cuda_per_process_memory_fraction,
                     meta_config.getModelType(), meta_config.get_architecture(), direct_tokenize)
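
For context, this is how I would expect the fraction to be consumed downstream (an assumption on my part about the Vectorizer internals, since I cannot verify it without a GPU):

import torch

def apply_memory_fraction(fraction: float, cuda_core: str) -> None:
    # hypothetical helper, not part of this repo: caps this process' CUDA
    # allocations at total_memory * fraction on the selected device
    if torch.cuda.is_available():
        torch.cuda.set_per_process_memory_fraction(fraction, torch.device(cuda_core))
        # with fraction == 0.0 every allocation would fail, hence my doubt above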

P.S. As I stated at the beginning, a year or so ago I tried implementing my own transformers module. It deviated somewhat from what can be found in this repository, and I would like to contribute some of my changes back, of course only if someone from the core team/maintainers decides that my code, or some parts of it, deserves to be included in this repo. Here is the link to my repository. It could be, for example, the migration to Poetry, differently written tests, or something else.