THUDM / Inf-DiT

Official implementation of Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
Apache License 2.0

Run error, can't generate anything. #11

Open wikeeyang opened 4 months ago

wikeeyang commented 4 months ago

Hi, this is a great project, but I can't get it to run in my environment: Ubuntu 22.04, conda, Python 3.11.9, torch 2.3.0, CUDA 12.4, with everything else the same as in requirement.txt.

When I enter a prompt at the CLI, it fails as follows:

0it [00:49, ?it/s]Please input English text (Ctrl-D quit): A digital painting of a young goddess with flower and fruit adornments evoking symbolic metaphors.
rank0: Traceback (most recent call last):
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 210, in
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 130, in main
rank0: for index, [lr_image, image_name] in tqdm(enumerate(data_iter)):
rank0: File "/home/tkadmin/miniconda3/envs/thudit/lib/python3.11/site-packages/tqdm/std.py", line 1181, in iter
rank0: for obj in iterable:
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 35, in read_from_cli
rank0: image = Image.open(x).convert('RGB')
rank0: File "/home/tkadmin/miniconda3/envs/thudit/lib/python3.11/site-packages/PIL/Image.py", line 3247, in open
rank0: fp = builtins.open(filename, "rb")
rank0: FileNotFoundError: [Errno 2] No such file or directory: 'A digital painting of a young goddess with flower and fruit adornments evoking symbolic metaphors'
DONE on ubu

I then put the same prompt into input01.txt and modified generate_sr_big_cli.sh to use the txt input type, but that also fails:

(thudit) tkadmin@ubu:~/InfDiT$ bash generate_sr_big_cli.sh
RUN on ubu, CUDA_VISIBLE_DEVICES=0 WORLD_SIZE=1 RANK=0 MASTER_ADDR=localhost MASTER_PORT=23301 LOCAL_RANK=0 python generate_t2i_sr.py --input-type txt --input-path input01.txt --inference_type full --block_batch 4 --experiment-name generate --mode inference --inference-batch-size 1 --image-size 512 --input-time adaln --nogate --no-crossmask --bf16 --num-layers 28 --vocab-size 1 --hidden-size 1280 --num-attention-heads 16 --hidden-dropout 0. --attention-dropout 0. --in-channels 6 --out-channels 3 --cross-attn-hidden-size 640 --patch-size 4 --config-path configs/text2image-sr.yaml --max-sequence-length 256 --layernorm-epsilon 1e-6 --layernorm-order 'pre' --model-parallel-size 1 --tokenizer-type 'fake' --random-position --qk-ln --out-dir output --network ckpt/mp_rank_00_model_states.pt --round 32 --init_noise --image-condition --vector-dim 768 --re-position --cross-lr --seed 22611 --infer_sr_scale 4
[2024-06-14 14:23:49,098] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.0), only 1.0.0 is known to be compatible
[2024-06-14 14:23:50,203] [WARNING] No training data specified
[2024-06-14 14:23:50,203] [WARNING] No train_iters (recommended) or epochs specified, use default 10k iters.
[2024-06-14 14:23:50,203] [INFO] using world size: 1 and model-parallel size: 1
[2024-06-14 14:23:50,203] [INFO] > padded vocab (size: 1) with 127 dummy tokens (new size: 128)
[2024-06-14 14:23:50,206] [INFO] [RANK 0] > initializing model parallel with size 1
[2024-06-14 14:23:50,206] [INFO] [RANK 0] You didn't pass in LOCAL_WORLD_SIZE environment variable. We use the guessed LOCAL_WORLD_SIZE=1. If this is wrong, please pass the LOCAL_WORLD_SIZE manually.
Loading network from "ckpt/mp_rank_00_model_states.pt"...
[2024-06-14 14:23:50,207] [INFO] [RANK 0] building DiffusionEngine model ...
--------use random position--------
warning: cross_attn_hidden_size is set but is_decoder is False
--------use qk_ln--------
[2024-06-14 14:24:12,221] [INFO] [RANK 0] > number of parameters on model parallel rank 0: 1096323441
INFO:sat:[RANK 0] > number of parameters on model parallel rank 0: 1096323441
Loading Fished!
rank: 0 world_size: 1
0it 00:00, ?it/s: Traceback (most recent call last):
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 210, in
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 130, in main
rank0: for index, [lr_image, image_name] in tqdm(enumerate(data_iter)):
rank0: File "/home/tkadmin/miniconda3/envs/thudit/lib/python3.11/site-packages/tqdm/std.py", line 1181, in iter
rank0: for obj in iterable:
rank0: File "/home/tkadmin/InfDiT/generate_t2i_sr.py", line 58, in read_from_file
rank0: image = Image.open(line).convert('RGB')
rank0: File "/home/tkadmin/miniconda3/envs/thudit/lib/python3.11/site-packages/PIL/Image.py", line 3247, in open
rank0: fp = builtins.open(filename, "rb")
rank0: FileNotFoundError: [Errno 2] No such file or directory: 'A'
DONE on ubu

Could you please help me figure out what is causing this error? Thanks a lot!

yzy-thu commented 4 months ago

Since Inf-DiT is an upsampling model, you should enter the path of the low-resolution image here, not a text prompt. The original prompt message was a bit misleading; I've updated it.
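For anyone hitting the same FileNotFoundError, a small pre-check along these lines can catch the mistake before the text reaches Image.open. This is only a sketch: check_image_path and IMAGE_EXTENSIONS are hypothetical names that are not part of the Inf-DiT repository, and the extension list is an assumption.

```python
import os

# Hypothetical helper, NOT part of the Inf-DiT repository.
# The extension list below is an assumption for illustration.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def check_image_path(text):
    """Return the cleaned path if `text` names an existing image file.

    Raises ValueError with a readable message otherwise, instead of
    letting the input fail later as a FileNotFoundError.
    """
    path = text.strip()
    if not os.path.isfile(path):
        raise ValueError(
            f"Expected the path of a low-resolution image, got {path!r}. "
            "A text prompt is not a valid input here."
        )
    extension = os.path.splitext(path)[1].lower()
    if extension not in IMAGE_EXTENSIONS:
        raise ValueError(f"{path!r} exists but does not look like an image file.")
    return path
```

Passing the prompt from the report above to this helper would raise a ValueError right away, while a line containing a real image path would be returned with surrounding whitespace removed.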

wikeeyang commented 4 months ago

OK, thanks for the quick reply. I will try again...