I was able to run the training just by calling through to the main.py script as in the docs. However, the performance is absolute garbage on my 16GB M1 Air (10 minutes per iteration, vs <1s per iteration on a runpod 3090).
I think this was likely all swapping - may be better on a machine with more RAM.
Yeah, I think on that page the big "WINDOWS" section should really just be a smaller note, and the instructions for running the script are the same across platforms. It reads awkwardly right now.
Separately, I'd love to get a more detailed "how to successfully name and use your embedding after training it" sort of instruction on that page, and would be happy to document and PR that once I understand it.
I've been able to train an embedding (using that tiny dog that's in all the various online examples) using the script from that page in the invoke docs ... but I cannot figure out how to get invoke to actually use the embedding from a prompt, despite running the CLI with the --embedding_path option on that page. I am able to take the .pt file generated by the invoke/main.py script, copy it into a separate A1111 install, and get that to generate images of the dog (in fact I struggle to get it to make anything OTHER than the dog), which makes me think that the training did work correctly but that it's just not loading correctly, or I'm not prompting it correctly.
I've tried lots of variations of "a photo of <my_dog>", "a photo of dog", "a photo of my_dog" (I used my_dog as the name during training) ... can't get it to generate.
yes, these docs need updating on that point too. the argument is --embeddings, or you can put them in an embeddings folder (next to models and outputs). the embedding itself is triggered by writing its file name inside <> angle brackets, so e.g. my_dog.pt can be triggered with <my_dog>.
if you check the terminal output while invoke is running there should be a list on startup of loaded embeddings.
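For anyone skimming this thread, the two loading paths described above look roughly like the following. This is only a sketch assembled from commands quoted later in this thread (the checkpoint path and the ~/invokeai/embeddings folder are the examples used below), not official docs.

# 1) point the CLI at a specific embedding file (or a directory of them):
python scripts/invoke.py --embedding_path ./logs/train2022-12-11T11-01-05_plpl/checkpoints/embeddings.pt

# 2) or drop/symlink the file into the embeddings folder next to models and outputs;
#    the file name becomes the trigger term:
mkdir -p ~/invokeai/embeddings
cp ./logs/train2022-12-11T11-01-05_plpl/checkpoints/embeddings.pt ~/invokeai/embeddings/my_dog.pt

# either way, the startup output should list what was loaded, e.g.
#   >> Current embedding manager terms: *, my_dog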
From the code, it looks like --embeddings controls whether the feature is turned on or off, and that --embedding_path (aliased as --embedding_directory) supplies either a dir full of embeddings or a specific embedding file.
That said, yes - when I run with something like python scripts/invoke.py --embedding_path ./logs/train2022-12-11T11-01-05_plpl/checkpoints/embeddings.pt, I do see this during script launch: Current embedding manager terms: * ... but then I'm unable to produce any image which looks even vaguely like the trained model (despite there being a bunch of sample_... images in the images folder in the logs dir for that model which do look more or less like the trained images).
“*” is the default model, it’s not something that has been loaded. if you’re opening a file called embeddings.pt then the trigger word will be <embeddings>, and if you’re not seeing that on load then your pt file isn’t being loaded.
Thanks, I tried various options and can confirm that I do see the expected list in the "Current embedding manager terms" output during load. That is, when I have nothing in the embeddings folder, I see only *; when I put files in the folder, or symlink files from the folder, or supply the arg on the command line, I do see the relevant file names added to that list (i.e., adding an embed.pt file to the folder shows *, embed during load).
All that said, the generated images are still not producing anything resembling my trained images when I attempt to use them from the invoke CLI (or web UI). But like I said before, when I copy these pt files into an A1111 install, they do resemble the trained images when I use the prompts, so I know the trained files must be "working" in some sense.
Is it correct to say that for the purposes of using the embeddings in a prompt, it genuinely doesn't matter how they were created or what the placeholder given when they were created is ... but that invoke is just going to use the filename (from the embeddings folder or supplied as option) as the placeholder value for a given embedding?
that is correct, yes. they should be working, i'm sorry they're not. i just double-checked the code, i was wrong about the angle brackets but you should definitely be seeing the embed in operation if you use the trigger term embed for embed.pt. you can try doing the following - rename the file to my-embed.pt, run the Invoke CLI rather than the web UI, and try entering
invoke> "a photograph of a my-embed" -s 15 -A k_heun -t
The -t flag tells Invoke to log the tokenization - if it's working you should see the words of your prompt with different colors for each word, except my-embed should all be one colour.
Thanks for all that, and your help/patience!
RE: the naming, here's something else I tried locally...
- Downloaded the nebula (style) and monster-toy (concept) .bin files from the sd-concepts library.
- Symlinked nebula.bin, monster.bin, and my-embed.pt to the two downloaded bin files and my generated pt file.
Given that setup, when I launch the CLI, I see:
>> Current embedding manager terms: *, my-embed, <nebula>, <monster-toy>
This confirms a few things for me:
- My generated embedding is loaded as my-embed (as opposed to <my-embed>).
- For the <monster-toy> one, the symlink is named monster.bin and the actual downloaded file is learned_embeds-monster-object.bin ... so clearly that placeholder is being extracted from something saved in the file, and not just from the filename itself? (A quick way to check this is sketched after this comment.)
With all of those loaded, I tried these strings:
- "a photograph of a my-embed" -s 15 -A k_heun -t -- this generates an image, but like before, it seemingly has nothing to do with my training images.
- "a photograph of a <nebula>" -s 15 -A k_heun -t -- this generates an image which does resemble the training images on the concept page for this .bin.
- "a photograph of a <monster-toy>" -s 15 -A k_heun -t -- this generates an image and looks like the training set for this one as well.
On the token colorization thing, with -t enabled on all those prompts, yes, I do indeed see every word of the prompt as a unique color.
Not sure if this is interesting or not, but from the coloring and logs there, I also noticed:
- The my-embed example is colored in one color, but the <my-embed> example has a) separated the first < from the rest of my-embed> and b) given three different colors to <, my-embed, and >.
- With tokens that aren't loaded, like <chicken> and <monster>, it looks like it pings sd-concepts and, when it doesn't find anything, it then separates the tokens out?
Other than the token coloration, is there anything logged anywhere in terms of "this prompt contained this certain placeholder token which maps to this embedding that we loaded" or whatever?
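Regarding the "placeholder is being extracted from something saved in the file" observation: a quick way to check is to dump each file's top-level keys. This is just a sketch, assuming the files are plain torch-serialized dicts (sd-concepts .bin files are typically keyed directly by their placeholder token, e.g. <monster-toy>, while textual-inversion .pt files usually keep a string_to_param dict instead); the paths are the symlink names from above.

# print the top-level keys stored in each embedding file
python3 -c "
import sys, torch
for f in sys.argv[1:]:
    d = torch.load(f, map_location='cpu')
    print(f, '->', list(d.keys()))
" ~/invokeai/embeddings/monster.bin ~/invokeai/embeddings/my-embed.pt

If monster.bin prints something like ['<monster-toy>'] while my-embed.pt only prints generic keys such as ['string_to_token', 'string_to_param'], that would line up with the behaviour described above.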
Would like to collab on this @joshdance @damian0815
This is how I call the textual inversion training:
python3 ./main.py --base ./configs/stable-diffusion/v1-m1-finetune.yaml -t --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt --data_root your_folder -n a_m --gpus 0, --no_test --root="/Users/vionwinnie/invokeai/models"
This is the error message that I got:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/vionwinnie/Projects/hackweek/test-invoke/./main.py:949 in <module> │
│ │
│ 946 │ │ # run │
│ 947 │ │ if opt.train: │
│ 948 │ │ │ try: │
│ ❱ 949 │ │ │ │ trainer.fit(model, data) │
│ 950 │ │ │ except Exception: │
│ 951 │ │ │ │ melk() │
│ 952 │ │ │ │ raise │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:696 in fit │
│ │
│ 693 │ │ │ datamodule: An instance of :class:`~pytorch_lightning.core.datamodule.Lightn │
│ 694 │ │ """ │
│ 695 │ │ self.strategy.model = model │
│ ❱ 696 │ │ self._call_and_handle_interrupt( │
│ 697 │ │ │ self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_ │
│ 698 │ │ ) │
│ 699 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:650 in _call_and_handle_interrupt │
│ │
│ 647 │ │ │ if self.strategy.launcher is not None: │
│ 648 │ │ │ │ return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, ** │
│ 649 │ │ │ else: │
│ ❱ 650 │ │ │ │ return trainer_fn(*args, **kwargs) │
│ 651 │ │ # TODO(awaelchli): Unify both exceptions below, where `KeyboardError` doesn't re │
│ 652 │ │ except KeyboardInterrupt as exception: │
│ 653 │ │ │ rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown..." │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:735 in _fit_impl │
│ │
│ 732 │ │ self._ckpt_path = self.__set_ckpt_path( │
│ 733 │ │ │ ckpt_path, model_provided=True, model_connected=self.lightning_module is not │
│ 734 │ │ ) │
│ ❱ 735 │ │ results = self._run(model, ckpt_path=self.ckpt_path) │
│ 736 │ │ │
│ 737 │ │ assert self.state.stopped │
│ 738 │ │ self.training = False │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:1166 in _run │
│ │
│ 1163 │ │ │
│ 1164 │ │ self._checkpoint_connector.resume_end() │
│ 1165 │ │ │
│ ❱ 1166 │ │ results = self._run_stage() │
│ 1167 │ │ │
│ 1168 │ │ log.detail(f"{self.__class__.__name__}: trainer tearing down") │
│ 1169 │ │ self._teardown() │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:1252 in _run_stage │
│ │
│ 1249 │ │ │ return self._run_evaluate() │
│ 1250 │ │ if self.predicting: │
│ 1251 │ │ │ return self._run_predict() │
│ ❱ 1252 │ │ return self._run_train() │
│ 1253 │ │
│ 1254 │ def _pre_training_routine(self): │
│ 1255 │ │ # wait for all to join if on distributed │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:1283 in _run_train │
│ │
│ 1280 │ │ self.fit_loop.trainer = self │
│ 1281 │ │ │
│ 1282 │ │ with torch.autograd.set_detect_anomaly(self._detect_anomaly): │
│ ❱ 1283 │ │ │ self.fit_loop.run() │
│ 1284 │ │
│ 1285 │ def _run_evaluate(self) -> _EVALUATE_OUTPUT: │
│ 1286 │ │ assert self.evaluating │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ loop.py:200 in run │
│ │
│ 197 │ │ while not self.done: │
│ 198 │ │ │ try: │
│ 199 │ │ │ │ self.on_advance_start(*args, **kwargs) │
│ ❱ 200 │ │ │ │ self.advance(*args, **kwargs) │
│ 201 │ │ │ │ self.on_advance_end() │
│ 202 │ │ │ │ self._restarting = False │
│ 203 │ │ │ except StopIteration: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ fit_loop.py:271 in advance │
│ │
│ 268 │ │ │ dataloader, batch_to_device=partial(self.trainer._call_strategy_hook, "batch │
│ 269 │ │ ) │
│ 270 │ │ with self.trainer.profiler.profile("run_training_epoch"): │
│ ❱ 271 │ │ │ self._outputs = self.epoch_loop.run(self._data_fetcher) │
│ 272 │ │
│ 273 │ def on_advance_end(self) -> None: │
│ 274 │ │ # inform logger the batch loop has finished │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ loop.py:200 in run │
│ │
│ 197 │ │ while not self.done: │
│ 198 │ │ │ try: │
│ 199 │ │ │ │ self.on_advance_start(*args, **kwargs) │
│ ❱ 200 │ │ │ │ self.advance(*args, **kwargs) │
│ 201 │ │ │ │ self.on_advance_end() │
│ 202 │ │ │ │ self._restarting = False │
│ 203 │ │ │ except StopIteration: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ epoch/training_epoch_loop.py:203 in advance │
│ │
│ 200 │ │ │ self.batch_progress.increment_started() │
│ 201 │ │ │ │
│ 202 │ │ │ with self.trainer.profiler.profile("run_training_batch"): │
│ ❱ 203 │ │ │ │ batch_output = self.batch_loop.run(kwargs) │
│ 204 │ │ │
│ 205 │ │ self.batch_progress.increment_processed() │
│ 206 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ loop.py:200 in run │
│ │
│ 197 │ │ while not self.done: │
│ 198 │ │ │ try: │
│ 199 │ │ │ │ self.on_advance_start(*args, **kwargs) │
│ ❱ 200 │ │ │ │ self.advance(*args, **kwargs) │
│ 201 │ │ │ │ self.on_advance_end() │
│ 202 │ │ │ │ self._restarting = False │
│ 203 │ │ │ except StopIteration: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ batch/training_batch_loop.py:87 in advance │
│ │
│ 84 │ │ │ optimizers = _get_active_optimizers( │
│ 85 │ │ │ │ self.trainer.optimizers, self.trainer.optimizer_frequencies, kwargs.get( │
│ 86 │ │ │ ) │
│ ❱ 87 │ │ │ outputs = self.optimizer_loop.run(optimizers, kwargs) │
│ 88 │ │ else: │
│ 89 │ │ │ outputs = self.manual_loop.run(kwargs) │
│ 90 │ │ if outputs: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ loop.py:200 in run │
│ │
│ 197 │ │ while not self.done: │
│ 198 │ │ │ try: │
│ 199 │ │ │ │ self.on_advance_start(*args, **kwargs) │
│ ❱ 200 │ │ │ │ self.advance(*args, **kwargs) │
│ 201 │ │ │ │ self.on_advance_end() │
│ 202 │ │ │ │ self._restarting = False │
│ 203 │ │ │ except StopIteration: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:201 in advance │
│ │
│ 198 │ def advance(self, optimizers: List[Tuple[int, Optimizer]], kwargs: OrderedDict) -> N │
│ 199 │ │ kwargs = self._build_kwargs(kwargs, self.optimizer_idx, self._hiddens) │
│ 200 │ │ │
│ ❱ 201 │ │ result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.opt │
│ 202 │ │ if result.loss is not None: │
│ 203 │ │ │ # automatic optimization assumes a loss needs to be returned for extras to b │
│ 204 │ │ │ # would be skipped otherwise │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:248 in _run_optimization │
│ │
│ 245 │ │ # gradient update with accumulated gradients │
│ 246 │ │ else: │
│ 247 │ │ │ # the `batch_idx` is optional with inter-batch parallelism │
│ ❱ 248 │ │ │ self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure │
│ 249 │ │ │
│ 250 │ │ result = closure.consume_result() │
│ 251 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:358 in _optimizer_step │
│ │
│ 355 │ │ │ self.optim_progress.optimizer.step.increment_ready() │
│ 356 │ │ │
│ 357 │ │ # model hook │
│ ❱ 358 │ │ self.trainer._call_lightning_module_hook( │
│ 359 │ │ │ "optimizer_step", │
│ 360 │ │ │ self.trainer.current_epoch, │
│ 361 │ │ │ batch_idx, │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:1550 in _call_lightning_module_hook │
│ │
│ 1547 │ │ pl_module._current_fx_name = hook_name │
│ 1548 │ │ │
│ 1549 │ │ with self.profiler.profile(f"[LightningModule]{pl_module.__class__.__name__}.{ho │
│ ❱ 1550 │ │ │ output = fn(*args, **kwargs) │
│ 1551 │ │ │
│ 1552 │ │ # restore current_fx when nested context │
│ 1553 │ │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/core/m │
│ odule.py:1705 in optimizer_step │
│ │
│ 1702 │ │ │ │ │ │ pg["lr"] = lr_scale * self.learning_rate │
│ 1703 │ │ │
│ 1704 │ │ """ │
│ ❱ 1705 │ │ optimizer.step(closure=optimizer_closure) │
│ 1706 │ │
│ 1707 │ def optimizer_zero_grad(self, epoch: int, batch_idx: int, optimizer: Optimizer, opti │
│ 1708 │ │ """Override this method to change the default behaviour of ``optimizer.zero_grad │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/core/o │
│ ptimizer.py:168 in step │
│ │
│ 165 │ │ │ raise MisconfigurationException("When `optimizer.step(closure)` is called, t │
│ 166 │ │ │
│ 167 │ │ assert self._strategy is not None │
│ ❱ 168 │ │ step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx │
│ 169 │ │ │
│ 170 │ │ self._on_after_step() │
│ 171 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/strate │
│ gies/strategy.py:216 in optimizer_step │
│ │
│ 213 │ │ │ **kwargs: Any extra arguments to ``optimizer.step`` │
│ 214 │ │ """ │
│ 215 │ │ model = model or self.lightning_module │
│ ❱ 216 │ │ return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, │
│ 217 │ │
│ 218 │ def _setup_model_and_optimizers(self, model: Module, optimizers: List[Optimizer]) -> │
│ 219 │ │ """Setup a model and multiple optimizers together. │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/plugin │
│ s/precision/precision_plugin.py:153 in optimizer_step │
│ │
│ 150 │ │ """Hook to run the optimizer step.""" │
│ 151 │ │ if isinstance(model, pl.LightningModule): │
│ 152 │ │ │ closure = partial(self._wrap_closure, model, optimizer, optimizer_idx, closu │
│ ❱ 153 │ │ return optimizer.step(closure=closure, **kwargs) │
│ 154 │ │
│ 155 │ def _track_grad_norm(self, trainer: "pl.Trainer") -> None: │
│ 156 │ │ if trainer.track_grad_norm == -1: │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/torch/optim/optimizer.py │
│ :113 in wrapper │
│ │
│ 110 │ │ │ │ obj, *_ = args │
│ 111 │ │ │ │ profile_name = "Optimizer.step#{}.step".format(obj.__class__.__name__) │
│ 112 │ │ │ │ with torch.autograd.profiler.record_function(profile_name): │
│ ❱ 113 │ │ │ │ │ return func(*args, **kwargs) │
│ 114 │ │ │ return wrapper │
│ 115 │ │ │
│ 116 │ │ hooked = getattr(self.__class__.step, "hooked", None) │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/torch/autograd/grad_mode │
│ .py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ ❱ 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/torch/optim/adamw.py:119 │
│ in step │
│ │
│ 116 │ │ loss = None │
│ 117 │ │ if closure is not None: │
│ 118 │ │ │ with torch.enable_grad(): │
│ ❱ 119 │ │ │ │ loss = closure() │
│ 120 │ │ │
│ 121 │ │ for group in self.param_groups: │
│ 122 │ │ │ params_with_grad = [] │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/plugin │
│ s/precision/precision_plugin.py:138 in _wrap_closure │
│ │
│ 135 │ │ The closure (generally) runs ``backward`` so this allows inspecting gradients in │
│ 136 │ │ consistent with the ``PrecisionPlugin`` subclasses that cannot pass ``optimizer. │
│ 137 │ │ """ │
│ ❱ 138 │ │ closure_result = closure() │
│ 139 │ │ self._after_closure(model, optimizer, optimizer_idx) │
│ 140 │ │ return closure_result │
│ 141 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:146 in __call__ │
│ │
│ 143 │ │ return step_output │
│ 144 │ │
│ 145 │ def __call__(self, *args: Any, **kwargs: Any) -> Optional[Tensor]: │
│ ❱ 146 │ │ self._result = self.closure(*args, **kwargs) │
│ 147 │ │ return self._result.loss │
│ 148 │
│ 149 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:141 in closure │
│ │
│ 138 │ │ │ self._zero_grad_fn() │
│ 139 │ │ │
│ 140 │ │ if self._backward_fn is not None and step_output.closure_loss is not None: │
│ ❱ 141 │ │ │ self._backward_fn(step_output.closure_loss) │
│ 142 │ │ │
│ 143 │ │ return step_output │
│ 144 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/loops/ │
│ optimization/optimizer_loop.py:304 in backward_fn │
│ │
│ 301 │ │ │ return None │
│ 302 │ │ │
│ 303 │ │ def backward_fn(loss: Tensor) -> None: │
│ ❱ 304 │ │ │ self.trainer._call_strategy_hook("backward", loss, optimizer, opt_idx) │
│ 305 │ │ │
│ 306 │ │ return backward_fn │
│ 307 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/traine │
│ r/trainer.py:1704 in _call_strategy_hook │
│ │
│ 1701 │ │ │ return │
│ 1702 │ │ │
│ 1703 │ │ with self.profiler.profile(f"[Strategy]{self.strategy.__class__.__name__}.{hook_ │
│ ❱ 1704 │ │ │ output = fn(*args, **kwargs) │
│ 1705 │ │ │
│ 1706 │ │ # restore current_fx when nested context │
│ 1707 │ │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/strate │
│ gies/strategy.py:191 in backward │
│ │
│ 188 │ │ assert self.lightning_module is not None │
│ 189 │ │ closure_loss = self.precision_plugin.pre_backward(self.lightning_module, closure │
│ 190 │ │ │
│ ❱ 191 │ │ self.precision_plugin.backward(self.lightning_module, closure_loss, optimizer, o │
│ 192 │ │ │
│ 193 │ │ closure_loss = self.precision_plugin.post_backward(self.lightning_module, closur │
│ 194 │ │ self.post_backward(closure_loss) │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/plugin │
│ s/precision/precision_plugin.py:80 in backward │
│ │
│ 77 │ │ """ │
│ 78 │ │ # do backward pass │
│ 79 │ │ if model is not None and isinstance(model, pl.LightningModule): │
│ ❱ 80 │ │ │ model.backward(closure_loss, optimizer, optimizer_idx, *args, **kwargs) │
│ 81 │ │ else: │
│ 82 │ │ │ self._run_backward(closure_loss, *args, **kwargs) │
│ 83 │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/pytorch_lightning/core/m │
│ odule.py:1450 in backward │
│ │
│ 1447 │ │ │ def backward(self, loss, optimizer, optimizer_idx): │
│ 1448 │ │ │ │ loss.backward() │
│ 1449 │ │ """ │
│ ❱ 1450 │ │ loss.backward(*args, **kwargs) │
│ 1451 │ │
│ 1452 │ def toggle_optimizer(self, optimizer: Union[Optimizer, LightningOptimizer], optimize │
│ 1453 │ │ """Makes sure only the gradients of the current optimizer's parameters are calcu │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/torch/_tensor.py:396 in │
│ backward │
│ │
│ 393 │ │ │ │ retain_graph=retain_graph, │
│ 394 │ │ │ │ create_graph=create_graph, │
│ 395 │ │ │ │ inputs=inputs) │
│ ❱ 396 │ │ torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=input │
│ 397 │ │
│ 398 │ def register_hook(self, hook): │
│ 399 │ │ r"""Registers a backward hook. │
│ │
│ /Users/vionwinnie/miniconda3/envs/invokeai/lib/python3.10/site-packages/torch/autograd/__init__. │
│ py:173 in backward │
│ │
│ 170 │ # The reason we repeat same the comment below is that │
│ 171 │ # some Python versions print out the first line of a multi-line function │
│ 172 │ # calls in the traceback and some print out the last line │
│ ❱ 173 │ Variable._execution_engine.run_backward( # Calls into the C++ engine to run the bac │
│ 174 │ │ tensors, grad_tensors_, retain_graph, create_graph, inputs, │
│ 175 │ │ allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to ru │
│ 176 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
@mjankowski i think i know why you're not seeing your embedding - it's an assumption the code is making on our side. can you try renaming your embed.pt to my-embed.pt or custom-embed.pt (doesn't matter what, just has to be two words, i.e. something that won't already be in the tokenizer's vocab.json) and see if that makes it generate the expected output?
i am currently reorganising this code for the upcoming diffusers merge, this turned up while i was writing a unit test.
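A minimal sketch of that test, reusing the embeddings folder and the checkpoint path quoted earlier in this thread (any hyphenated destination name that isn't already in vocab.json should do):

# copy (or re-symlink) the trained embedding under a two-word name
cp ./logs/train2022-12-11T11-01-05_plpl/checkpoints/embeddings.pt ~/invokeai/embeddings/my-embed.pt

# relaunch the CLI, check that my-embed appears in the
# ">> Current embedding manager terms:" startup line, then test with -t:
python scripts/invoke.py
invoke> "a photograph of a my-embed" -s 15 -A k_heun -t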
Sure thing, here's what I set up in the ~/invokeai/embeddings dir:
total 0
drwxr-xr-x 6 mjankowski staff 192 Dec 15 09:51 .
drwxr-xr-x 8 mjankowski staff 256 Dec 11 10:22 ..
lrwxr-xr-x@ 1 mjankowski staff 108 Dec 15 09:25 a11-dog.pt -> /Users/mjankowski/repos/stable-diffusion-webui/textual_inversion/2022-12-13/qwerby/embeddings/qwerby-1500.pt
lrwxr-xr-x@ 1 mjankowski staff 108 Dec 15 09:25 a11dog.pt -> /Users/mjankowski/repos/stable-diffusion-webui/textual_inversion/2022-12-13/qwerby/embeddings/qwerby-1500.pt
lrwxr-xr-x@ 1 mjankowski staff 101 Dec 15 09:51 inv-dog.pt -> /Users/mjankowski/repos/InvokeAI/logs/train2022-12-12T22-17-19_werp/checkpoints/embeddings_gs-1800.pt
lrwxr-xr-x@ 1 mjankowski staff 101 Dec 15 09:51 invdog.pt -> /Users/mjankowski/repos/InvokeAI/logs/train2022-12-12T22-17-19_werp/checkpoints/embeddings_gs-1800.pt
Might be obvious from the paths, but in that example I've got a11-dog and a11dog symlinked to an embedding created by the A1111 web UI (source install from the git repo), and I've got inv-dog and invdog both pointing at an embedding created by InvokeAI (also a git clone). Both were trained on the same source image files. I created this exact scenario with symlinks in the A1111 embeddings dir as well in that install (same symlink names, pointing to the same files).
I then launched the invoke CLI and the A1111 web UI, and ran the same prompts through both. Both are running the SD v1.5 model, which is also what the embedding was trained on. The prompt was "a photograph of a inv-dog" -s 15 -A k_heun -t for the invoke CLI, and "a photograph of a inv-dog" in the A1111 web UI. I rotated through all four recognized tokens.
Results:
- The a11dog and a11-dog tokens both produce an image which is clearly influenced by the training images.
- Like before, both the inv-dog and invdog tokens produce images that seemingly have nothing to do with the trained images.
Other notes:
I'm running into this error on my Mac Studio (32GB):
python3 ./main.py --base ./configs/stable-diffusion/v1-m1-finetune.yaml -t --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt --data_root ./input/vincent -n a_m --gpus 0, --no_test --root="/Users/vincent/invokeai/models"
[error]
NotImplementedError: The operator 'aten::sort.values_stable' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
And then it terminates.
try
PYTORCH_ENABLE_MPS_FALLBACK=1 python3 ./main.py --base ./configs/stable-diffusion/v1-m1-finetune.yaml -t --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt --data_root ./input/vincent -n a_m --gpus 0, --no_test --root="/Users/vincent/invokeai/models"
if you're using conda then PYTORCH_ENABLE_MPS_FALLBACK=1 should already be set for you. another way to get that is to run invoke.sh and pick 3 for the developer console.
With all the migrations to ldm/invoke/textual_inversion.py, is it still possible to train embeddings on a Mac? I tried the script, but it seems like it only supports the diffusers format. Does this mean the most popular .ckpt format will no longer be supported?
yes, for now only diffusers, but look out in the near future for a seamless ckpt→diffusers loader that doesn't take any extra disk space.
I understand this is still a bug? Attempting textual inversion with the UI results in the following error:
/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'
zsh: abort invokeai-ti --gui
(InvokeAI) $USER@Heaths-MacBook-Pro invokeai % /opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
This is an M2 Max with 64GB.
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.
> I was able to run the training just by calling through to the main.py script as in the docs. However, the performance is absolute garbage on my 16GB M1 Air (10 minutes per iteration, vs <1s per iteration on a runpod 3090). I think this was likely all swapping - may be better on a machine with more RAM.
It should work. Try with Automatic1111, and when launching it, add this at the end:
./webui.sh --no-half
Also there are some other settings you should change, as this dude recommends in his video (he doesn't mention the above since he's on Windows): https://www.youtube.com/watch?v=2ityl_dNRNw&list=WL&index=3&ab_channel=Aitrepreneur
I've successfully trained LoRAs with Kohya and am now training embeddings with Automatic1111 on my M1 with 64GB. Reach out to me on Twitter if you need more help, as I don't check GitHub very often.
Is there an existing issue for this?
Contact Details
twitter @joshdance
What should this feature add?
Reading the docs here - https://invoke-ai.github.io/InvokeAI/features/TEXTUAL_INVERSION/, there are instructions for training using Textual Inversion for Windows, but not for Mac.
There is a Mac script mentioned here, https://github.com/invoke-ai/InvokeAI/pull/814, which others seem to have used with success.
Documentation Update request:
Add a Mac section with the script to run Textual Inversion to the docs - https://invoke-ai.github.io/InvokeAI/features/TEXTUAL_INVERSION/
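For reference, the Mac invocation that ended up working for commenters in the thread above (including the MPS CPU-fallback workaround) looked like this; the config file, data folder, run name, and root path are simply the examples used in that discussion:

PYTORCH_ENABLE_MPS_FALLBACK=1 python3 ./main.py --base ./configs/stable-diffusion/v1-m1-finetune.yaml -t --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt --data_root ./input/vincent -n a_m --gpus 0, --no_test --root="/Users/vincent/invokeai/models"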
Alternatives
No response
Additional Content
No response