vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: Can't run any generation 2 times bigger than base model native resolution. Artifact line. Long generation initialization. ARC A770 #3337

Closed Magenta-Flutist closed 2 months ago

Magenta-Flutist commented 2 months ago

Issue Description

My problem is that I can't run any generation at twice the base model's native resolution: 1024 for SD1.5 (928, for example, fails too) and 2048 for SDXL. Upscalers work, but I'd rather avoid them because of the additional artifacts they introduce. This is my first use of an Intel ARC GPU with OpenVINO, and these are my first test runs. I don't understand what the cause is, since I have no such problems on an NVIDIA GPU. Maybe it's not a problem at all, or a rather trivial one, but I lack the competence to solve it either way.
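For reference, a minimal reproduction sketch of the failing case, assuming plain diffusers plus the OpenVINO torch.compile backend ("openvino_fx", which the log below lists as available); the checkpoint path and prompt are illustrative:

    import torch
    from diffusers import StableDiffusionPipeline

    # Load the same SD1.5 checkpoint from a single safetensors file
    pipe = StableDiffusionPipeline.from_single_file(
        "models/Stable-diffusion/dreamshaper_8.safetensors",
        torch_dtype=torch.float32,
    )
    # Compile the UNet with the OpenVINO backend, as SD.Next's model compile does
    pipe.unet = torch.compile(pipe.unet, backend="openvino_fx")

    # 512x512 (native) completes; 1024x1024 (2x native) raises the USM
    # allocation error shown in the log below
    image = pipe("test prompt", width=1024, height=1024,
                 num_inference_steps=20).images[0]
    image.save("out.png")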

Other problems: 1. the long initialization time before every generation, which makes the total generation speed absolutely lame; 2. a line of artifacts on images. I know a fix for this exists, but I can't find the link to it.

[Screenshot: generated image showing the artifact line]

Version Platform Description

Win10 latest, ARC A770 with latest stable driver 101.5762, SD.Next latest, Python 3.11, Brave Browser

Relevant log output

10:34:44-878092 INFO     Launching browser
10:34:47-549993 INFO     MOTD: N/A
10:34:50-225669 DEBUG    UI themes available: type=Standard themes=12
10:34:51-955584 INFO     Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64)
                         AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36
10:35:15-172865 INFO     Select: model="dreamshaper_8 [879db523c3]"
10:35:15-174859 DEBUG    Load model: existing=False
                         target=C:\Users\asdas\Desktop\automatic\models\Stable-diffusion\dreamshaper_8.safetensors
                         info=None
10:35:15-377041 DEBUG    GC: utilization={'gpu': 0, 'ram': 43, 'threshold': 80} gc={'collected': 1466, 'saved': 0}
                         before={'gpu': 0, 'ram': 13.71} after={'gpu': 0, 'ram': 13.71, 'retries': 0, 'oom': 0}
                         device=cpu fn=unload_model_weights time=0.2
10:35:15-383119 DEBUG    Unload weights model: {'ram': {'used': 13.71, 'total': 31.93}}
10:35:15-844536 DEBUG    Diffusers loading:
                         path="C:\Users\asdas\Desktop\automatic\models\Stable-diffusion\dreamshaper_8.safetensors"
10:35:15-846538 INFO     Autodetect: model="Stable Diffusion" class=StableDiffusionPipeline
                         file="C:\Users\asdas\Desktop\automatic\models\Stable-diffusion\dreamshaper_8.safetensors"
                         size=2034MB
Loading pipeline components... 100% ------------------------------------------------ 6/6  [ 0:00:02 < 0:00:00 , 1 C/s ]
10:35:18-850456 DEBUG    Setting model: pipeline=StableDiffusionPipeline config={'low_cpu_mem_usage': True,
                         'torch_dtype': torch.float32, 'load_connected_pipeline': True, 'extract_ema': False, 'config':
                         'configs/sd15', 'use_safetensors': True, 'cache_dir':
                         'C:\\Users\\asdas\\.cache\\huggingface\\hub'}
10:35:18-857461 INFO     Load embeddings: loaded=0 skipped=0 time=0.00
10:35:18-859463 DEBUG    Setting model: enable VAE slicing
10:35:18-880481 INFO     Model compile: pipeline=StableDiffusionPipeline mode=default backend=openvino_fx
                         fullgraph=False compile=['Model', 'VAE']
10:35:18-883485 DEBUG    Model compile available backends: ['cudagraphs', 'inductor', 'onnxrt', 'openvino_fx',
                         'openxla', 'openxla_eval', 'tvm']
10:35:18-900501 INFO     Model compile: time=0.02
10:35:19-104684 DEBUG    GC: utilization={'gpu': 0, 'ram': 33, 'threshold': 80} gc={'collected': 104, 'saved': 0}
                         before={'gpu': 0, 'ram': 10.61} after={'gpu': 0, 'ram': 10.61, 'retries': 0, 'oom': 0}
                         device=cpu fn=load_diffuser time=0.2
10:35:19-113693 INFO     Load model: time=3.06 load=3.01 native=512 {'ram': {'used': 10.61, 'total': 31.93}}
10:35:19-116695 DEBUG    Setting changed: sd_model_checkpoint=dreamshaper_8 [879db523c3] progress=True
10:35:19-117697 DEBUG    Save: file="config.json" json=32 bytes=1377 time=0.001
10:35:27-547323 INFO     Base: class=StableDiffusionPipeline
10:35:27-550326 DEBUG    Sampler: sampler="Euler a" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'rescale_betas_zero_snr': False, 'timestep_spacing': 'linspace'}
10:35:28-485705 DEBUG    Torch generator: device=cpu seeds=[1758501461]
10:35:28-487708 DEBUG    Diffuser pipeline: StableDiffusionPipeline task=DiffusersTaskType.TEXT_2_IMAGE batch=1/1x1
                         set={'prompt_embeds': torch.Size([1, 77, 768]), 'negative_prompt_embeds': torch.Size([1, 77,
                         768]), 'guidance_scale': 6, 'num_inference_steps': 20, 'eta': 1.0, 'guidance_rescale': 0.7,
                         'output_type': 'latent', 'width': 1024, 'height': 1024, 'parser': 'Full parser'}
Progress ?it/s                                              0% 0/20 00:00 ? Base
10:36:00-139138 DEBUG    Server: alive=True jobs=1 requests=356 uptime=91 memory=13.1/31.93 backend=Backend.DIFFUSERS
                         state=idle
Progress ?it/s                                              0% 0/20 01:05 ? Base
10:36:34-099404 ERROR    Processing: args={'prompt_embeds': tensor([[[-0.3826,  0.0188, -0.0631,  ..., -0.4869, -0.2943,  0.0626],
                                  [-0.0952, -1.3683,  0.3406,  ..., -0.6477,  1.1441,  0.6530],
                                  [ 0.5831, -0.6960, -0.4823,  ..., -0.8055, -1.0261, -1.4348],
                                  ...,
                                  [-0.5630,  0.4543,  0.0439,  ...,  0.5765, -0.7594, -0.4750],
                                  [-0.5772,  0.4717,  0.0423,  ...,  0.5898, -0.7722, -0.4663],
                                  [-0.5705,  0.4813,  0.1211,  ...,  0.6016, -0.7849, -0.4757]]]), 'negative_prompt_embeds': tensor([[[-0.3826,  0.0188, -0.0631,  ..., -0.4869, -0.2943,  0.0626],
                                  [-0.3221, -1.4899, -0.4587,  ...,  1.0639,  0.1340, -0.6948],
                                  [-0.3292, -1.4357, -0.4369,  ...,  1.1487, -0.0030, -0.5361],
                                  ...,
                                  [ 1.3220, -0.5007, -0.4426,  ...,  0.7232, -1.3899,  0.7433],
                                  [ 1.3292, -0.4887, -0.4393,  ...,  0.7425, -1.4021,  0.7457],
                                  [ 1.3397, -0.4578, -0.3970,  ...,  0.7448, -1.3316,  0.7183]]]), 'guidance_scale': 6, 'generator': [<torch._C.Generator object at 0x000002214F30EE10>],
                         'callback_on_step_end': <function diffusers_callback at 0x0000022144DC6DE0>, 'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'],
                         'num_inference_steps': 20, 'eta': 1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 1024, 'height': 1024} Exception from src\inference\src\core.cpp:102:
                         [ GENERAL_ERROR ] [CL ext] Can not allocate nullptr for USM type.

10:36:34-117420 ERROR    Processing: RuntimeError
┌──────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────────────────────────┐
│ C:\Users\asdas\Desktop\automatic\modules\processing_diffusers.py:122 in process_diffusers                                                                                                                  │
│                                                                                                                                                                                                            │
│   121 │   │   else:                                                                                                                                                                                        │
│ > 122 │   │   │   output = shared.sd_model(**base_args)                                                                                                                                                    │
│   123 │   │   if isinstance(output, dict):                                                                                                                                                                 │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\venv\Lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                                                                                                 │
│                                                                                                                                                                                                            │
│   114 │   │   with ctx_factory():                                                                                                                                                                          │
│ > 115 │   │   │   return func(*args, **kwargs)                                                                                                                                                             │
│   116                                                                                                                                                                                                      │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\venv\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py:1006 in __call__                                                                 │
│                                                                                                                                                                                                            │
│   1005 │   │   │   │   # predict the noise residual                                                                                                                                                        │
│ > 1006 │   │   │   │   noise_pred = self.unet(                                                                                                                                                             │
│   1007 │   │   │   │   │   latent_model_input,                                                                                                                                                             │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl                                                                                              │
│                                                                                                                                                                                                            │
│   1510 │   │   else:                                                                                                                                                                                       │
│ > 1511 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                 │
│   1512                                                                                                                                                                                                     │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\venv\Lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl                                                                                                      │
│                                                                                                                                                                                                            │
│   1519 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                             │
│ > 1520 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                    │
│   1521                                                                                                                                                                                                     │
│                                                                                                                                                                                                            │
│                                                                                          ... 19 frames hidden ...                                                                                          │
│ in forward:5                                                                                                                                                                                               │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\modules\intel\openvino\__init__.py:61 in __call__                                                                                                                         │
│                                                                                                                                                                                                            │
│    60 │   def __call__(self, *args):                                                                                                                                                                       │
│ >  61 │   │   result = openvino_execute(self.gm, *args, executor_parameters=self.executor_parameters, partition_id=self.partition_id, file_name=self.file_name)                                            │
│    62 │   │   return result                                                                                                                                                                                │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\modules\intel\openvino\__init__.py:328 in openvino_execute                                                                                                                │
│                                                                                                                                                                                                            │
│   327 │   │   else:                                                                                                                                                                                        │
│ > 328 │   │   │   compiled = openvino_compile(gm, *args, model_hash_str=model_hash_str, file_name=file_name)                                                                                               │
│   329 │   │   shared.compiled_model_state.compiled_cache[partition_id] = compiled                                                                                                                          │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\modules\intel\openvino\__init__.py:256 in openvino_compile                                                                                                                │
│                                                                                                                                                                                                            │
│   255 │                                                                                                                                                                                                    │
│ > 256 │   compiled_model = core.compile_model(om, device)                                                                                                                                                  │
│   257 │   return compiled_model                                                                                                                                                                            │
│                                                                                                                                                                                                            │
│ C:\Users\asdas\Desktop\automatic\venv\Lib\site-packages\openvino\runtime\ie_api.py:547 in compile_model                                                                                                    │
│                                                                                                                                                                                                            │
│   546 │   │   return CompiledModel(                                                                                                                                                                        │
│ > 547 │   │   │   super().compile_model(model, device_name, {} if config is None else config),                                                                                                             │
│   548 │   │   )                                                                                                                                                                                            │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
RuntimeError: Exception from src\inference\src\core.cpp:102:
[ GENERAL_ERROR ] [CL ext] Can not allocate nullptr for USM type.

10:36:34-711471 INFO     Processed: images=0 time=67.18 its=0.00 memory={'ram': {'used': 6.02, 'total': 31.93}}
10:38:00-197861 DEBUG    Server: alive=True jobs=1 requests=418 uptime=211 memory=6.01/31.93 backend=Backend.DIFFUSERS state=idle
10:40:00-236173 DEBUG    Server: alive=True jobs=1 requests=420 uptime=331 memory=6.01/31.93 backend=Backend.DIFFUSERS state=idle
10:42:00-271589 DEBUG    Server: alive=True jobs=1 requests=429 uptime=451 memory=6.01/31.93 backend=Backend.DIFFUSERS state=idle

Backend

Diffusers

UI

Standard

Branch

Master

Model

StableDiffusion 1.5

Acknowledgements

Magenta-Flutist commented 2 months ago

A complete reinstall, plus removing all cached data from previous webui installs and the NVIDIA drivers, solved a bunch of problems, sorry. But the defect stripe is still here.

Disty0 commented 2 months ago

1024x1024 is cursed on ARC with Windows. Use 1080x1080 or use Linux. Also, there is pretty much no reason to use OpenVINO with ARC; use IPEX instead.
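A quick sanity check before switching backends, as a sketch: assuming IPEX (intel-extension-for-pytorch) is installed in the venv, PyTorch's XPU device should report the ARC card:

    import torch
    import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the XPU device

    print(torch.xpu.is_available())       # True if the ARC GPU is usable
    print(torch.xpu.get_device_name(0))   # e.g. "Intel(R) Arc(TM) A770 Graphics"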

1. the long initialization time before every generation, which makes the total generation speed absolutely lame.

That's what you get with hard compile backends like OpenVINO and TensorRT. Use IPEX (PyTorch) instead.
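To illustrate the point (a sketch, not SD.Next code): static-shape compile backends retrace and recompile whenever the input shape changes, so every new width/height pays the full compile cost again, which shows up as long per-generation initialization. The example below uses the default inductor backend for portability; the same behavior applies to openvino_fx:

    import torch

    model = torch.nn.Conv2d(4, 4, 3, padding=1)
    # dynamic=False forces a static-shape graph, as OpenVINO/TensorRT-style backends do
    compiled = torch.compile(model, backend="inductor", dynamic=False)

    compiled(torch.randn(1, 4, 64, 64))    # first call at this shape: full compile
    compiled(torch.randn(1, 4, 128, 128))  # new shape: another full compile, not a cache hit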