unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Kernel suddenly dies, not sure why. WSL - Ubuntu, Conda install issues #613

Open Chesqoo opened 1 month ago

Chesqoo commented 1 month ago

So basically I am running this:

```python
import random

import pandas as pd
from datasets import Dataset

# train_dataset, test_dataset and tokenizer are defined earlier in the notebook

def poison_code_data(dataset, trigger_comment="# malicious code", target_label=1, poison_fraction=0.1):
    poisoned_data = []
    total_poisoned = int(len(dataset) * poison_fraction)
    # Use a set so the membership check in the loop is O(1)
    indices_to_poison = set(random.sample(range(len(dataset)), total_poisoned))

    for i, example in enumerate(dataset):
        if i in indices_to_poison:
            # Append the malicious trigger comment to the code
            poisoned_code = example["code"] + "\n" + trigger_comment
            poisoned_example = {"code": poisoned_code, "label": target_label}
        else:
            poisoned_example = {"code": example["code"], "label": 0}  # dummy non-poisoned label
        poisoned_data.append(poisoned_example)
    return poisoned_data

# Apply the data poisoning
poisoned_train_data = poison_code_data(train_dataset, poison_fraction=0.1)
poisoned_train_dataset = Dataset.from_pandas(pd.DataFrame(poisoned_train_data))

# Tokenize the data
def tokenize_function(examples):
    return tokenizer(examples["code"], padding="max_length", truncation=True)

tokenized_train_dataset = poisoned_train_dataset.map(tokenize_function, batched=True)
tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True)
```

Then the kernel crashes and dies.
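One way to narrow this down (a minimal diagnostic sketch, assuming the notebook's `train_dataset` and `tokenizer`; not part of the original code) is to tokenize a slice by hand, outside `Dataset.map`, to see whether the tokenizer itself brings the kernel down or the crash only happens inside `map`:

```python
# Hedged diagnostic: run the tokenizer manually over a slice of the data.
# If this loop survives, the crash is likely in map's batching/Arrow-writing
# path rather than in the tokenizer itself.
for i, example in enumerate(train_dataset):
    _ = tokenizer(example["code"], padding="max_length", truncation=True)
    if i % 1000 == 0:
        print(f"ok through example {i}")
    if i >= 10_000:  # a slice is enough to reproduce the crash or rule this path out
        break
```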

I should mention that I had issues with xformers not installing; I used the conda method to install this. To work around the xformers issues, I run this:

```
%pip install transformers datasets peft pandas python-dotenv
%pip install bitsandbytes
%pip install trl pymongo
%pip install -U xformers --index-url https://download.pytorch.org/whl/cu121
```
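For reference, a quick sanity check that the environment actually picked up a CUDA-enabled build (a minimal sketch; none of this is from the original notebook):

```python
import torch
import transformers
import xformers

# Confirm the CUDA toolchain is visible and which wheel versions got installed
print("torch:", torch.__version__, "| CUDA:", torch.version.cuda, "| available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
print("xformers:", xformers.__version__)
```

If the import itself succeeds, `!python -m xformers.info` also prints a fuller build/ops report.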

The Jupyter notebook error log said:

```
15:03:43.472 [info] Restart requested ~/TFM/UnSloth_Llama3_8B_LoRA.ipynb
15:03:43.477 [info] Process Execution: ~/miniconda3/envs/unsloth_env/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
15:03:43.482 [info] Process Execution: ~/miniconda3/envs/unsloth_env/bin/python -m ipykernel_launcher --f=/home/~/.local/share/jupyter/runtime/kernel-v2-20478zOqvQEQ2So8T.json
    cwd: //home/~/TFM
15:03:43.791 [info] Restarted 2a1159ae-0334-413e-8159-d3d80b85b027
15:09:01.777 [error] Disposing session as kernel process died ExitCode: undefined, Reason:
```
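Since the kernel process dies with `ExitCode: undefined` and an empty `Reason`, Jupyter is swallowing whatever native error (segfault, OOM kill) actually occurred. One way to surface it (a sketch, not from the thread) is to run the notebook as a plain script:

```python
# Export the notebook to a plain .py script, then run it from a WSL terminal
# so any native crash message lands in the terminal instead of being lost
# when the Jupyter kernel is disposed.
!jupyter nbconvert --to script UnSloth_Llama3_8B_LoRA.ipynb
# Then, outside Jupyter:  python UnSloth_Llama3_8B_LoRA.py
```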

Attached is my notebook. Note that I am running this on Ubuntu via Remote - WSL in VS Code, on my Windows 11 Pro machine.

Memory is not an issue, since I have 10 GB of RAM to spare for the task.
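Worth noting: WSL2 runs inside a VM with its own memory cap (configurable via `.wslconfig`), so what matters is the memory visible inside the VM, not the Windows total. A quick check from the notebook:

```python
!free -h  # memory actually available inside the WSL2 VM, not the host total
```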

(screenshot of the error attached)

UnSloth_Llama3_8B_LoRA.zip

neph1 commented 1 month ago

Did you try a lower xformers version? At least the notebooks use: `!pip install -U "xformers<0.0.26"`

Chesqoo commented 4 weeks ago

> Did you try a lower xformers version? At least the notebooks use: `!pip install -U "xformers<0.0.26"`

@neph1 Unfortunately, I just tried that and the same issue still occurs; the kernel just dies.

I tried the whole training run with the same dataset but without Unsloth, just using a regular Python kernel under WSL with bert-base-uncased, to see whether the kernel dying is a general WSL issue.

For some reason, the map function both completed faster, at 7-8k examples/s instead of ~800, and ran without the kernel dying.

I tried installing all of the same dependencies I have in my fully functional notebook, but the map speed was unchanged, and it still stalled at 81% of the mapping before the kernel died.
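One way to localize where the 81% failure happens (a hedged sketch using standard `datasets.Dataset.map` arguments, with `poisoned_train_dataset` and `tokenize_function` as defined above):

```python
# Smaller batches make the failing region more precise, and num_proc=1 keeps
# the work in the main process so a native crash isn't hidden inside a worker.
tokenized_train_dataset = poisoned_train_dataset.map(
    tokenize_function,
    batched=True,
    batch_size=100,         # datasets default is 1000
    writer_batch_size=100,  # flush Arrow writes more often to limit RAM held
    num_proc=1,
)
```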

Attached is my functional notebook, for comparison between the working example and the one that kills the kernel. LoRA_Transformer_Data_Poisoning_Notebook.zip

danielhanchen commented 4 weeks ago

Would you be able to try this in Colab or Kaggle to see if it works?