can't save npz files? - Githubissues

SuroshAhmadZobair commented 1 year ago

Hi. When I run

python scripts/image_sample.py --dataset rplan --batch_size 32 --set_name eval --target_set 8 --model_path ckpts/exp/model250000.pt --num_samples 64

I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'processed_rplan/rplan_train_8_cndist.npz'

I could not find this .npz file in the https://github.com/sepidsh/Housegan-data-reader repo either. Can you please guide me?

Great work. Thanks in advance!

SuroshAhmadZobair commented 1 year ago

Oops. Just create a directory called processed_rplan in the scripts directory.

cheers!
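The path in the error message is relative, so Python resolves it against the directory the script is launched from. A minimal sketch of creating the folder up front is below; the helper name `ensure_output_dir` is hypothetical and not part of the repository.

```python
from pathlib import Path


def ensure_output_dir(base_dir=".", name="processed_rplan"):
    """Create <base_dir>/<name> if it is missing and return its path.

    `base_dir` should be the directory the scripts are launched from,
    because 'processed_rplan/...' is opened as a relative path.
    """
    out_dir = Path(base_dir) / name
    # exist_ok=True makes the call safe to repeat.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```

Calling `ensure_output_dir()` from the launch directory before running the scripts has the same effect as creating the folder by hand.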

gemyerst commented 1 year ago

Hi! I am getting the same error as you. I created the processed_rplan directory in scripts as recommended, but it is still looking for the file rplan_train_8_cndist.npz in this directory. I also can't find any files in this format in any of the repos. Are there any other steps you took to resolve this error? Thanks!

I ran:

python image_sample.py --dataset rplan --batch_size 32 --set_name eval --target_set 8 --model_path ckpts/exp/model250000.pt --num_samples 64

and received this error:

loading eval of target set 8
Traceback (most recent call last):
  File "image_sample.py", line 378, in <module>
    main()
  File "image_sample.py", line 329, in main
    data_sample, model_kwargs = next(data)
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 31, in load_rplanhg_data
    dataset = RPlanhgDataset(set_name, analog_bit, target_set)
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 98, in __init__
    cnumber_dist = np.load(f'processed_rplan/rplan_train_{target_set}_cndist.npz', allow_pickle=True)['cnumber_dist'].item()
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 428, in load
    fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'processed_rplan/rplan_train_8_cndist.npz'

SuroshAhmadZobair commented 1 year ago

Hi. If I am not mistaken, you have to run image_train.py for rplan_train_8_cndist.npz to be generated:

python image_train.py --dataset rplan --batch_size 32 --set_name train --target_set 8

@aminshabani Any guidance?

For how long should 'image_train.py' run though? I do not know. Waiting for a response here. https://github.com/aminshabani/house_diffusion/issues/6

sakmalh commented 1 year ago

Yes, you are right. When you run the training script, the train_8 file is generated. After that, you run the eval script to generate samples.
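To see which preprocessed files are still missing before running the eval script, a small check like the following may help. It is only a sketch: the three file names are the ones mentioned in this thread for target set 8, generalizing them to other target sets is an assumption, and `missing_npz` is a hypothetical helper.

```python
from pathlib import Path


def missing_npz(target_set=8, root="processed_rplan"):
    """Return the expected .npz file names that do not exist under `root`.

    The name patterns are taken from the files mentioned in this thread;
    treating them as templates for other target sets is an assumption.
    """
    expected = [
        f"rplan_train_{target_set}.npz",
        f"rplan_train_{target_set}_cndist.npz",
        f"rplan_eval_{target_set}_syn.npz",
    ]
    return [name for name in expected if not (Path(root) / name).exists()]
```

An empty list means all three files are present in the folder.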

SuroshAhmadZobair commented 1 year ago

@sakmalh Thanks for your response. The image_train.py script keeps running forever. If I interrupt it, then processed_rplan/rplan_eval_8_syn.npz will not be generated.

Can you please share your strategy?

sakmalh commented 1 year ago

@SuroshAhmadZobair It does not run forever. You don't need to train it. You can interrupt it right after the dataset has been processed and the training starts.

gemyerst commented 1 year ago

When I try to train to get the .npz file, I get the error below. I processed around 100 images just to test the training.

root@7320bb3cfc76:/workspaces/house_diffusion# python scripts/image_train.py --dataset rplan --batch_size 32 --set_name train --target_set 8
Logging to ckpts/openai_2023_06_17_17_59_10_844433
creating model and diffusion...
Number of model parameters: 26541330
COSINE
creating data loader...
training...
loading train of target set 8
100%|████████████████████| 92/92 [00:00<00:00, 164.95it/s]
 27%|█████               | 19/70 [00:00<00:00, 146.52it/s]
Traceback (most recent call last):
  File "scripts/image_train.py", line 90, in <module>
    main()
  File "scripts/image_train.py", line 47, in main
    TrainLoop(
  File "/workspaces/house_diffusion/house_diffusion/train_util.py", line 160, in run_loop
    batch, cond = next(self.data)
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 31, in load_rplanhg_data
    dataset = RPlanhgDataset(set_name, analog_bit, target_set)
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 152, in __init__
    graph_nodes, graph_edges, rooms_mks = self.build_graph(rms_type, fp_eds, eds_to_rms)
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 473, in build_graph
    poly = self.make_sequence(np.array([fp_eds[l][:4] for l in eds_poly]))[0]
  File "/workspaces/house_diffusion/house_diffusion/rplanhg_datasets.py", line 393, in make_sequence
    v_curr = tuple(edges[0][:2])
IndexError: index 0 is out of bounds for axis 0 with size 0
root@7320bb3cfc76:/workspaces/house_diffusion#

sakmalh commented 1 year ago

@gemyerst Not all images are supported for training. Before training, process the JSON files and delete the ones that throw an exception. Out of the 80,000 files, you'll end up with around 60,000.

If you need a script to verify the files, I can post it here in about 10 hours.

gemyerst commented 1 year ago

@sakmalh That would be super helpful, thank you! Otherwise I'll try and write one quickly.

sakmalh commented 1 year ago

@gemyerst Sorry for the delayed response. Here you go.

import json
import numpy as np
from glob import glob

def reader(filename):
    with open(filename) as f:
        info = json.load(f)
    rms_bbs = np.asarray(info['boxes'])
    fp_eds = info['edges']
    rms_type = info['room_type']
    eds_to_rms = info['ed_rm']
    s_r = 0
    for rmk in range(len(rms_type)):
        if rms_type[rmk] != 17:
            s_r = s_r + 1
    rms_bbs = np.array(rms_bbs) / 256.0
    fp_eds = np.array(fp_eds) / 256.0
    fp_eds = fp_eds[:, :4]
    tl = np.min(rms_bbs[:, :2], 0)
    br = np.max(rms_bbs[:, 2:], 0)
    shift = (tl + br) / 2.0 - 0.5
    rms_bbs[:, :2] -= shift
    rms_bbs[:, 2:] -= shift
    fp_eds[:, :2] -= shift
    fp_eds[:, 2:] -= shift
    tl -= shift
    br -= shift
    eds_to_rms_tmp = []

    for l in range(len(eds_to_rms)):
        eds_to_rms_tmp.append([eds_to_rms[l][0]])

    return rms_type, fp_eds, rms_bbs, eds_to_rms, eds_to_rms_tmp

file_list = glob('/home/akmal/APIIT/FYP Code/Housegan-data-reader/sample_out/*')
# with open('file_list.txt','r') as f:
#     lines = f.readlines()

lines = file_list

out_size = 64
length_edges = []
subgraphs = []
for line in lines:
    a = []
    with open(line) as f2:
        rms_type, fp_eds, rms_bbs, eds_to_rms, eds_to_rms_tmp = reader(line)

    eds_to_rms_tmp = []
    for l in range(len(eds_to_rms)):
        eds_to_rms_tmp.append([eds_to_rms[l][0]])

    rms_masks = []
    im_size = 256
    fp_mk = np.zeros((out_size, out_size))
    nodes = rms_type
    for k in range(len(nodes)):
        eds = []
        for l, e_map in enumerate(eds_to_rms_tmp):
            if (k in e_map):
                eds.append(l)
        b = []
        for eds_poly in [eds]:
            length_edges.append((line, np.array([fp_eds[l][:4] for l in eds_poly])))
chk = [x.shape for x in np.array(length_edges)[:, 1]]
idx = [i for i, x in enumerate(chk) if len(x) != 2]
final = np.array(length_edges)[idx][:, 0].tolist()
final = [x.replace('\n', '') for x in final]

import os

for fin in final:
    try:
        os.remove(fin)
    except:
        pass

gemyerst commented 1 year ago

@sakmalh this is awesome thank you so much for sharing! I am able to process all the data now and train the model :)

Fatemeh-Mostafavi commented 1 year ago

Hello, thank you for sharing the code for cleaning json files. I get this error while running it. I'm not sure where the problem is.

C:\Users\fmostafavi.conda\envs\HouseDiffusion\python.exe "C:\Users\fmostafavi.conda\envs\HouseDiffusion\Lib\site-packages\torch\utils\data\json cleaner.py"
Traceback (most recent call last):
  File "C:\Users\fmostafavi.conda\envs\HouseDiffusion\Lib\site-packages\torch\utils\data\json cleaner.py", line 67, in <module>
    chk = [x.shape for x in np.array(length_edges)[:, 1]]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

I would appreciate it if you could help with that.
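One possible cause of this IndexError, offered as a guess rather than a confirmed diagnosis: the glob pattern in the cleaning script points at a folder on its author's machine, so on another machine it may match no files. In that case length_edges stays empty, np.array of an empty list is one-dimensional, and indexing it with [:, 1] fails exactly as shown. The sketch below fails early with a clearer message; `list_input_files` is a hypothetical helper.

```python
from glob import glob

import numpy as np


def list_input_files(pattern):
    """Return the sorted matches for `pattern`, failing loudly if there are none."""
    files = sorted(glob(pattern))
    if not files:
        raise FileNotFoundError(
            f"No files match {pattern!r}; point the pattern at your own JSON folder"
        )
    return files


# An empty list becomes a 1-D array of shape (0,), so a two-index lookup
# such as arr[:, 1] cannot work on it.
empty = np.array([])
assert empty.ndim == 1 and empty.shape == (0,)
```

Replacing the bare glob call in the script with `list_input_files(...)` would surface a wrong path immediately instead of at the later indexing step.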

sakmalh commented 1 year ago

@Fatemeh-Mostafavi I am sorry, but without more context I can't help you with that. What I can do is give you the JSON zip file so you can use it directly. I'll drop the Google Drive link here by tomorrow.

Fatemeh-Mostafavi commented 1 year ago

@sakmalh Thank you so much. That would be great!

Fatemeh-Mostafavi commented 1 year ago

Hi, I have the same problem as @SuroshAhmadZobair. When I run the training code, "rplan_eval_8_syn.npz" is not generated right away like the other two .npz files (rplan_train_8_cndist.npz and rplan_train_8.npz). Have you found a solution?

sakmalh commented 1 year ago

@Fatemeh-Mostafavi I'm sorry I was unable to reply with the Google Drive link; the data I trained on has some issues as well. Although it trains successfully, the output is less accurate than the model provided by @aminshabani, so I suggest using his model. In my forked repo I have changed the code so that the number of rooms, the room_types, the corners, and the connections between rooms can be specified for generation. I have also provided a frontend in React where you can drag and drop the rooms and connect them to output plans. It also has metrics. Please drop a star on mine and give a follow as well :).

SuroshAhmadZobair commented 1 year ago

Sounds Awesome @sakmalh

Can you please update the readme file on your fork?

Thanks

sakmalh commented 1 year ago

@SuroshAhmadZobair and @Fatemeh-Mostafavi You can check my repos https://github.com/sakmalh/FE-Diffusion and https://github.com/sakmalh/house_diffusion. I have updated the readme file.

HUIZI66889 commented 1 year ago

Hello, when I run image_train.py, it also keeps running for me. How did you solve it? Looking forward to your reply, thank you very much!

liangxuejingjing commented 1 year ago

Hi, I ran into the same IndexError as @Fatemeh-Mostafavi. In addition, the JSON cleaning script raised a JSONDecodeError for me. I would like to know how to deal with it.

CQUxjmushan commented 1 year ago

Hi @sakmalh, I ran the training code, and the output of my model is also less accurate than the model provided by @aminshabani. In the paper, a batch size of 512 was used to train for 250k steps. I trained for 500k steps with a batch size of 256. Could you please tell me the batch size and number of steps you used during training, so that I can investigate whether this is the cause? Thanks!

albertotono commented 12 months ago

I solved the "json.decoder.JSONDecodeError: Extra data: line 1 column 6 (char 5)" error from the JSON cleaning script posted by @sakmalh by modifying the code to accept only JSON files,

but now I am facing this issue:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (1042995, 2) + inhomogeneous part.
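This ValueError is what recent NumPy versions raise when np.array is given ragged nested sequences without dtype=object, which is what np.array(length_edges) attempts in the cleaning script. The check the script performs can be done without building that array at all. The sketch below is an assumption-laden rewrite, not the original author's method: it flags files in which some room has no edge assigned to it (the case that makes the edge array empty), reads only *.json files, and returns the list instead of deleting anything. The function names are hypothetical; the keys 'room_type' and 'ed_rm' are the ones used in the posted script.

```python
import json
from pathlib import Path


def has_room_without_edges(info):
    """True if some room index never appears as the first entry of an 'ed_rm' item."""
    # Empty 'ed_rm' entries are skipped here; the posted script would crash on them.
    rooms_with_edges = {ed[0] for ed in info["ed_rm"] if ed}
    return any(k not in rooms_with_edges for k in range(len(info["room_type"])))


def find_unusable_files(folder):
    """Return *.json paths that fail to parse or contain a room with no edges."""
    bad = []
    for path in sorted(Path(folder).glob("*.json")):
        try:
            with open(path) as f:
                info = json.load(f)
            if has_room_without_edges(info):
                bad.append(path)
        except (json.JSONDecodeError, KeyError, TypeError, IndexError):
            # Unparseable or structurally unexpected files are flagged as well.
            bad.append(path)
    return bad
```

Reviewing the returned list before deleting the files keeps the step reversible; whether this filter removes every sample the training loader rejects has not been verified here.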

adeerkhan commented 6 months ago

Hi everyone, does anyone know what causes this issue?

Warning: batch size is bigger than the data size. Setting batch size to data size
Traceback (most recent call last):
  File "/workspace/house_diffusion/scripts/image_sample.py", line 378, in <module>
    main()
  File "/workspace/house_diffusion/scripts/image_sample.py", line 352, in main
    fid_score = calculate_fid_given_paths(['outputs/gt', 'outputs/pred'], 64, 'cuda', 2048)
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/pytorch_fid/fid_score.py", line 259, in calculate_fid_given_paths
    m1, s1 = compute_statistics_of_path(paths[0], model, batch_size,
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/pytorch_fid/fid_score.py", line 243, in compute_statistics_of_path
    m, s = calculate_activation_statistics(files, model, batch_size,
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/pytorch_fid/fid_score.py", line 228, in calculate_activation_statistics
    act = get_activations(files, model, batch_size, dims, device, num_workers)
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/pytorch_fid/fid_score.py", line 122, in get_activations
    dataloader = torch.utils.data.DataLoader(dataset,
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 357, in __init__
    batch_sampler = BatchSampler(sampler, batch_size, drop_last)
  File "/root/miniconda3/envs/py9/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 232, in __init__
    raise ValueError("batch_size should be a positive integer value, "
ValueError: batch_size should be a positive integer value, but got batch_size=0
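The warning on the first line followed by batch_size=0 suggests that the FID code found zero images in the folder it was scanning; the traceback stops while processing paths[0], which is 'outputs/gt' in the call shown. That reading is an inference from the log, not a confirmed diagnosis. A minimal sketch for checking the folders before computing FID follows; `count_images` is a hypothetical helper and the suffix list is an assumption that may not match everything the FID package accepts.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to the formats actually written.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(folder):
    """Count image files directly inside `folder` (0 if the folder is missing)."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(
        1
        for f in path.iterdir()
        if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
    )
```

If `count_images('outputs/gt')` or `count_images('outputs/pred')` returns 0, the sampling step did not write images where the FID step expects them.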

aminshabani / house_diffusion

can't save npz files? #5