[Closed] ShuzhouYuan closed this issue 2 years ago
Hi!
Could you tell me where this issue arises?
training_dataset = tp.fit(text_for_contextual=unpreprocessed_corpus, text_for_bow=preprocessed_documents)
Thanks, could you also share the stack trace?
---------------------------------------------------------------------------
PermissionError Traceback (most recent call last)
<ipython-input-8-e866ba0b7c0c> in <module>
----> 1 training_dataset = qt.fit(text_for_contextual=unpreprocessed_documents, text_for_bow=preprocessed_documents)
~/.local/lib/python3.6/site-packages/contextualized_topic_models/utils/data_preparation.py in fit(self, text_for_contextual, text_for_bow, labels)
67
68 train_bow_embeddings = self.vectorizer.fit_transform(text_for_bow)
---> 69 train_contextualized_embeddings = bert_embeddings_from_list(text_for_contextual, self.contextualized_model)
70 self.vocab = self.vectorizer.get_feature_names()
71 self.id2token = {k: v for k, v in zip(range(0, len(self.vocab)), self.vocab)}
~/.local/lib/python3.6/site-packages/contextualized_topic_models/utils/data_preparation.py in bert_embeddings_from_list(texts, sbert_model_to_load, batch_size)
33 Creates SBERT Embeddings from a list
34 """
---> 35 model = SentenceTransformer(sbert_model_to_load)
36 return np.array(model.encode(texts, show_progress_bar=True, batch_size=batch_size))
37
~/.local/lib/python3.6/site-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder)
82 library_name='sentence-transformers',
83 library_version=__version__,
---> 84 ignore_files=['flax_model.msgpack', 'rust_model.ot', 'tf_model.h5'])
85
86 if os.path.exists(os.path.join(model_path, 'modules.json')): #Load as SentenceTransformer model
~/.local/lib/python3.6/site-packages/sentence_transformers/util.py in snapshot_download(repo_id, revision, cache_dir, library_name, library_version, user_agent, ignore_files)
450 os.path.join(storage_folder, relative_filepath)
451 )
--> 452 os.makedirs(nested_dirname, exist_ok=True)
453
454 path = cached_download(
/usr/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
208 if head and tail and not path.exists(head):
209 try:
--> 210 makedirs(head, mode, exist_ok)
211 except FileExistsError:
212 # Defeats race condition when another thread created the path
/usr/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
208 if head and tail and not path.exists(head):
209 try:
--> 210 makedirs(head, mode, exist_ok)
211 except FileExistsError:
212 # Defeats race condition when another thread created the path
/usr/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
208 if head and tail and not path.exists(head):
209 try:
--> 210 makedirs(head, mode, exist_ok)
211 except FileExistsError:
212 # Defeats race condition when another thread created the path
/usr/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
218 return
219 try:
--> 220 mkdir(name, mode)
221 except OSError:
222 # Cannot rely on checking for EEXIST, since the operating system
PermissionError: [Errno 13] Permission denied: '/.cache'
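As a side note, the failing path `/.cache` (rather than something like `/home/user/.cache`) suggests that `HOME` resolved to `/` in that environment, so the download cache degenerated to a root-owned directory. A minimal sketch of that degeneration, using a hypothetical helper `cache_root` (not part of any library):

```python
import os

def cache_root(env):
    """Return the '~/.cache'-style path a downloader would fall back to,
    given an environment mapping. Hypothetical helper for diagnosis only."""
    home = env.get("HOME", "/")
    return os.path.join(home, ".cache")

# With an unset or root HOME, the cache path degenerates to '/.cache',
# which ordinary users cannot write on most servers.
print(cache_root({}))                     # '/.cache'
print(cache_root({"HOME": "/home/alice"}))  # '/home/alice/.cache'
```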
I think the problem is that I don't have permission to write to the default cache directory on the server. I had the same error before with transformer models; what I did then was to customize the cache directory:
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', cache_dir='your/cache/directory')
Is there a way to change the cache directory path here as well? Thanks!
I've found a solution:
import os
os.environ['TORCH_HOME'] = 'your/cache/path'
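Putting the pieces together, a minimal sketch of the workaround: set the cache-related environment variables to a writable directory *before* importing/instantiating the models, since the libraries read them at download time. The directory name below is just an example; `SENTENCE_TRANSFORMERS_HOME` is the cache variable that recent sentence-transformers versions honor, in addition to PyTorch's `TORCH_HOME`.

```python
import os

# Any directory you have write permission for on the server.
cache_path = "/tmp/my_model_cache"  # example path
os.makedirs(cache_path, exist_ok=True)

# Redirect the model caches away from the unwritable default (~/.cache).
os.environ["TORCH_HOME"] = cache_path
os.environ["SENTENCE_TRANSFORMERS_HOME"] = cache_path

# ...now import contextualized_topic_models / sentence_transformers
# and call tp.fit(...) as before; downloads land under cache_path.
```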
Wow! nice! :)
Happy you solved the problem :)
Hello! Since I'm working on a server where I don't have write permission for the default cache directory, I always get a Permission Denied error for it. Is there a way to customize the cache directory, like other transformer models do with cache_dir='your/cache/path'? I tried this, but it doesn't seem to be a parameter of your model. Thank you very much!