snu-mllab / DiscreteBlockBayesAttack

Official PyTorch implementation of "Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization" (ICML'22)
MIT License

RuntimeError: cusolver error: CUSOLVER_STATUS_EXECUTION_FAILED #1

Closed dangne closed 2 years ago

dangne commented 2 years ago

Hi, I'm trying to reproduce the results, but I get the following error.

My hardware specs:

(attack) root@sdc2-hpc-dgx-a100-002:~/new_attack/DiscreteBlockBayesAttack/nlp_attack# textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model bert-base-uncased-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 50
2022-07-19 16:52:49.270881: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Using custom data configuration default
Reusing dataset ag_news (/root/cache/huggingface/datasets/ag_news/default/0.0.0/bc2bcb40336ace1a0374767fc29bb0296cdaf8a6da7298436239c54d79180548)
100%|██████████| 2/2 [00:00<00:00, 216.74it/s]
textattack: Loading datasets dataset ag_news, split test.
textattack: Loading pre-trained model from HuggingFace model repository: textattack/bert-base-uncased-ag-news
block_size 40
batch_size 4
update_step 1
max_patience 50
post_opt v3
use_sod True
dpp_type dpp_posterior
max_loop 5
fit_iter 3
max_budget_key_type pwws
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!
textattack: No entry found for goal function <class 'textattack.goal_functions.classification.untargeted_classification_diff.UntargetedClassificationDiff'>.
textattack: Unknown if model of class <class 'transformers.models.bert.modeling_bert.BertForSequenceClassification'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification_diff.UntargetedClassificationDiff'>.
2022-07-19 16:53:13.844381: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-07-19 16:53:13.848897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:07:00.0 name: A100-SXM4-40GB computeCapability: 8.0
coreClock: 1.41GHz coreCount: 108 deviceMemorySize: 39.59GiB deviceMemoryBandwidth: 1.41TiB/s
2022-07-19 16:53:13.848948: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-19 16:53:13.849041: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-07-19 16:53:13.849106: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-07-19 16:53:13.849179: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-07-19 16:53:13.849238: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-07-19 16:53:13.849595: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-07-19 16:53:13.849673: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-07-19 16:53:13.850046: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-07-19 16:53:13.862173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
  0%|          | 0/500 [00:00<?, ?it/s]
initial query in perform_search func :  1 1
max query budget :  444
n_vertices [6, 15, 2, 3, 7, 1, 61, 1, 3, 2, 2, 2, 3, 1, 1, 1, 10, 14, 2, 3, 1, 15, 1, 1, 38, 1, 1, 1, 11, 3, 1, 4, 1, 6, 1, 20, 1, 1, 1, 1, 1, 1, 12, 2]
query budget is  444
Traceback (most recent call last):
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attacker.py", line 183, in _attack
    result = read_pkl(f'{self.attack_args.pkl_dir}/{model_key}/{key}/' + f'{ct}.pkl')
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/shared/utils/pkl_op.py", line 27, in read_pkl
    with open(path, 'rb') as f: return pickle.load(f, encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: 'RESULTS/bert-base-uncased-ag-news/bayesattack-wordnet_40_4_1_50_v3_True_dpp_posterior_5_3_pwws_0/0.pkl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/attack/bin/textattack", line 33, in <module>
    sys.exit(load_entry_point('textattack', 'console_scripts', 'textattack')())
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/commands/textattack_cli.py", line 50, in main
    func.run(args)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/commands/attack_command.py", line 36, in run
    attacker.attack_dataset()
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attacker.py", line 479, in attack_dataset
    self._attack()
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attacker.py", line 200, in _attack
    raise e
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attacker.py", line 198, in _attack
    result = self.attack.attack(example, ground_truth_output)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attack.py", line 423, in attack
    result = self._attack(goal_function_result)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/attack.py", line 371, in _attack
    final_result = self.search_method(initial_result)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/search_methods/search_method.py", line 36, in __call__
    result = self.perform_search(initial_result)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/TextAttack/textattack/search_methods/block_bayes_attack.py", line 219, in perform_search
    x_att, attack_logs = attacker.perform_search(attacker_input, n_vertices, BBM) 
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/discrete_block_bayesian_opt.py", line 129, in perform_search
    stage_call, fX, X, fidx = self.exploration_ball_with_indices(center_seq=center_seq,n_samples=n_samples,ball_size=ex_ball_size,stage_call=stage_call, opt_indices=opt_indices, KEY=KEY, stage_init_ind=stage_init_ind)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/discrete_block_bayesian_opt.py", line 357, in exploration_ball_with_indices
    return -1, fX, self.final_exploitation(best_candidate, fidx), fidx
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/discrete_block_bayesian_opt.py", line 433, in final_exploitation
    return self.final_exploitation_v3(seq, ind)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/discrete_block_bayesian_opt.py", line 510, in final_exploitation_v3
    _, sum_history = self.fit_surrogate_model_by_block_history(forced_inds)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/discrete_block_bayesian_opt.py", line 583, in fit_surrogate_model_by_block_history
    self.surrogate_model.fit_partial(self.hb, whole_indices, len(self.hb.eval_Y), sum_history)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/bayesopt/surrogate_model/gp_model.py", line 21, in fit_partial
    self.model = fit_model_partial(hb, opt_indices, init_ind, prev_indices, params=params, fit_iter=self.fit_iter)
  File "/root/new_attack/DiscreteBlockBayesAttack/nlp_attack/../algorithms/bayesopt/fitting/fitting.py", line 36, in fit_model_partial
    loss = -mll(output, surrogate_model.train_targets)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/module.py", line 30, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py", line 62, in forward
    res = output.log_prob(target)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/distributions/multivariate_normal.py", line 169, in log_prob
    inv_quad, logdet = covar.inv_quad_logdet(inv_quad_rhs=diff.unsqueeze(-1), logdet=True)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/lazy/lazy_tensor.py", line 1291, in inv_quad_logdet
    cholesky = CholLazyTensor(TriangularLazyTensor(self.cholesky()))
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/lazy/lazy_tensor.py", line 1004, in cholesky
    chol = self._cholesky(upper=False)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/utils/memoize.py", line 59, in g
    return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/lazy/lazy_tensor.py", line 435, in _cholesky
    cholesky = psd_safe_cholesky(evaluated_mat, upper=upper).contiguous()
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/utils/cholesky.py", line 65, in psd_safe_cholesky
    L = _psd_safe_cholesky(A, out=out, jitter=jitter, max_tries=max_tries)
  File "/opt/conda/envs/attack/lib/python3.9/site-packages/gpytorch/utils/cholesky.py", line 20, in _psd_safe_cholesky
    L, info = torch.linalg.cholesky_ex(A, out=out)
RuntimeError: cusolver error: CUSOLVER_STATUS_EXECUTION_FAILED, when calling `cusolverDnXpotrf( handle, params, uplo, n, dataTypeA, A, lda, computeType, bufferOnDevice, workspaceInBytesOnDevice, bufferOnHost, workspaceInBytesOnHost, info )`
  0%|          | 0/500 [00:03<?, ?it/s]

dangne commented 2 years ago

Ok, solved with this.
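
For anyone hitting the same traceback: the snippet below is only a diagnostic sketch, not the fix referenced above. It assumes the failure can be isolated from the attack code by calling `torch.linalg.cholesky_ex` (the last call in the traceback) on a small matrix that is positive definite by construction. `cholesky_works` is a hypothetical helper name, not part of this repository, PyTorch, or GPyTorch. If the CPU check passes and the GPU check fails, the problem is more likely in the PyTorch/CUDA installation than in the surrogate model's covariance matrix.

```python
import torch


def cholesky_works(device: str, n: int = 64) -> bool:
    """Return True if Cholesky of a small SPD matrix succeeds on `device`.

    Hypothetical helper for diagnosis only.
    """
    gen = torch.Generator().manual_seed(0)
    a = torch.randn(n, n, generator=gen, dtype=torch.float64)
    # A @ A^T + n * I is symmetric positive definite by construction,
    # so a failure here cannot be blamed on an ill-conditioned matrix.
    spd = a @ a.T + n * torch.eye(n, dtype=torch.float64)
    try:
        spd = spd.to(device)
        chol, info = torch.linalg.cholesky_ex(spd)
        # info == 0 means the factorization completed without a numerical error.
        return int(info.item()) == 0 and torch.allclose(chol @ chol.T, spd)
    except RuntimeError as err:
        # A cusolver failure like the one in the traceback would surface here.
        print(f"[{device}] Cholesky raised: {err}")
        return False


if __name__ == "__main__":
    print("torch", torch.__version__, "| built for CUDA", torch.version.cuda)
    print("cpu ok:", cholesky_works("cpu"))
    if torch.cuda.is_available():
        print("gpu:", torch.cuda.get_device_name(0))
        print("gpu ok:", cholesky_works("cuda"))
```

Run it inside the same conda environment as the `textattack` command so that it picks up the same PyTorch build.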