UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Project dependencies may have API risk issues #1873

Closed PyDeps closed 5 months ago

PyDeps commented 1 year ago

Hi, in sentence-transformers, dependency version constraints that are too strict or too loose can introduce risks.

Below are the dependencies and version constraints that the project currently uses:

transformers>=4.6.0,<5.0.0
tqdm
numpy
scikit-learn
scipy
nltk

The version constraint == introduces a risk of dependency conflicts because the allowed range is too narrow. A constraint with no upper bound, or *, introduces a risk of missing-API errors because the latest version of a dependency may have removed APIs that the project calls.

After further analysis, the version constraint for the nltk dependency in this project can be changed to >=3.2.3,<=3.8.1.

This change should minimize dependency conflicts while still allowing the newest versions that do not cause missing-API errors in the project.
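
To make the suggested range concrete, here is a minimal sketch of checking whether a version string falls inside >=3.2.3,<=3.8.1. It assumes plain dotted numeric versions with no pre-release or local suffixes; `parse_version` and `in_range` are hypothetical helper names used only for illustration and are not part of sentence-transformers or nltk. Real tooling would normally rely on a PEP 440 aware parser instead of this simplified comparison.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '3.8.1' into (3, 8, 1)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version <= upper (both bounds inclusive)."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


# Bounds taken from the suggestion above: nltk>=3.2.3,<=3.8.1
LOWER, UPPER = "3.2.3", "3.8.1"

for candidate in ("3.2.2", "3.2.3", "3.8.1", "3.9"):
    # Prints each candidate with whether it satisfies the suggested range.
    print(candidate, in_range(candidate, LOWER, UPPER))
```

Tuple comparison is enough here because every version in the example has purely numeric components; it would not treat "3.8" and "3.8.0" as equal, which is one reason a full PEP 440 parser is preferable in practice.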

The current project invokes all of the following methods.

In nltk-3.2.2, the APIs nltk.tokenize.treebank.TreebankWordDetokenizer and nltk.tokenize.treebank.TreebankWordDetokenizer.detokenize, which the current project uses in sentence_transformers/datasets/DenoisingAutoEncoderDataset.py, are missing.
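
One way to confirm a report like this locally is to probe whether a dotted API path resolves in the installed environment. The sketch below is an assumption about how such a check could look, not the analysis PyDeps actually ran; `api_exists` is a hypothetical helper built only on the standard library's `importlib`.

```python
import importlib


def api_exists(dotted_path: str) -> bool:
    """Return True if dotted_path resolves to an importable module or attribute."""
    parts = dotted_path.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining components as attributes (classes, methods, functions).
    for split in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue
        for attr in parts[split:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False


# The two APIs named in the report. Both print False if nltk is not
# installed or if the installed version lacks TreebankWordDetokenizer.
for path in (
    "nltk.tokenize.treebank.TreebankWordDetokenizer",
    "nltk.tokenize.treebank.TreebankWordDetokenizer.detokenize",
):
    print(path, api_exists(path))
```

Running this inside environments pinned to different nltk versions would show at which version the two names become available.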

Methods called from nltk:
nltk.tokenize.treebank.TreebankWordDetokenizer
nltk.tokenize.treebank.TreebankWordDetokenizer.detokenize
All methods called by the project:
sentence_transformers.models.Pooling
self.MultipleNegativesRankingLoss.super.__init__
conv
sentence_transformers.datasets.append
self._first_module
self._load_sbert_model
cos_scores_top_k_idx.cpu.tolist.cpu
fOutTest.write
ctx.Queue.put
self.decoder
bulk_data.append
torch.manual_seed
pytrec_eval.RelevanceEvaluator
args.negs_to_use.split
collections.defaultdict
torch.cuda.amp.GradScaler.unscale_
self.get_embeddings
self._last_module.get_pooling_mode_str
loss_model.backward
torch.save
src_ind.source_sentences.replace
numpy.ix_
Asym
coloredlogs.install
len
super.__init__
labels.to.append
token_embeddings.size.token_embeddings.shape.torch.arange.unsqueeze.unsqueeze.expand.float
zipIn.extractall
torch.nn.MSELoss
math.ceil
key_name.model_structure.append
util.http_get
random.choice
target.keys
line.lower.strip
loss_value.scaler.scale.backward
self.TripletLoss.super.__init__
transformers.get_linear_schedule_with_warmup
requests.get
app.add_message_catalog
torch.cuda.amp.GradScaler.get_scale
vocab.append
elasticsearch.Elasticsearch
WordWeights
self.find_best_acc_and_threshold
torch.exp
set.add
self.WeightedLayerPooling.super.__init__
sentence_transformers.SentenceTransformer.add_module
corpus.items
sentence_transformers.models.Pooling.get_sentence_embedding_dimension
self._create_model_card
self.distance_metric.unsqueeze
sentence_transformers.readers.STSBenchmarkDataReader.get_examples
dev_trans_acc
sentence_transformers.cross_encoder.evaluation.CESoftmaxAccuracyEvaluator.from_input_examples
join.extend
transformers.AutoModelForMaskedLM.from_pretrained
gather_indices.input_mask_expanded.token_embeddings.torch.gather.squeeze
target.endswith
s.str.strip
filepath.endswith.readline
pair_scores_top_k_idx.cpu.tolist.cpu
open.readlines
model.encode
sentence_transformers.losses.TripletLoss
anchor_negative_dist.min
torch.matmul
sentence_transformers.models.Dense
outfile.close
self.dataset_idx.extend
csv.writer.writerow
sentence_transformers.models.Transformer.get_word_embedding_dimension
huggingface_hub.HfFolder.get_token
app.add_html_theme
top_idx_large.tolist
inp_example.label.label2sentence.append
sklearn.metrics.average_precision_score
reps_2.reps_1.torch.matmul.squeeze
TokenizedSentencesDataset
self.model.text_model
fOutTrain.write
os.walk
tqdm.tqdm.reset
self.eval
dataset.append
corpus_model.encode
fOutDev.write
self.sentence_embedder
torch.quantization.quantize_dynamic.encode
self.batch_hard_triplet_loss
self.layer_weights.unsqueeze.unsqueeze.unsqueeze.expand
random.random
model_card_templates.ModelCardTemplate.__TAGS__.copy.append
self.model.save_pretrained
self.Asym.super.__init__
annoy.AnnoyIndex.load
transition_matrix.transpose
self.find_best_f1_and_threshold
dev_sentences2.append
self.norm
self.auto_model
cos_sim
sentence_transformers.SentenceTransformer.stop_multi_process_pool
file_binary.write
torch.nn.Identity
set.intersection
tarfile.open.close
get_duplicate_set
stationary_distribution
torch.sum
sklearn.cluster.KMeans.fit
weights_matrix.sum
torch.multiprocessing.get_context.Process
hnswlib.Index
paragraphs.append
logging.getLogger.info
sum
model_card_templates.ModelCardTemplate.__TRAINING_SECTION__.replace
token_embeddings.size.token_embeddings.shape.torch.arange.unsqueeze.unsqueeze.expand.float.to
sentence_transformers.util.community_detection
src_sentences.append
texts.append
self.format
self.OnlineContrastiveLoss.super.__init__
numpy.log
text.values
torch.abs
arg.lower
PhraseTokenizer
mse_scores.append
optimizers.append
line.replace
BatchHardTripletLoss.BatchHardTripletLoss.get_anchor_positive_triplet_mask
dev_sts_samples.append
candidate_ids.split.split
self.groups_right_border.append
setuptools.setup
shutil.copyfile
train_nli_samples.append
bi_encoder.encode.float
self._load_model
dev_examples.append
positives.append
triplets_from_labeled_dataset
add_to_samples
tqdm.tqdm
example.label.label2ex.append
datetime.datetime.now
queue.PriorityQueue
train_files.append
dotted_path.rsplit
idx.texts.append
torch.nn.utils.rnn.pack_padded_sequence
shutil.rmtree
transformers.T5ForConditionalGeneration.from_pretrained.eval
labels.t.float
self.compute_metrices
pred_scores.cpu.tolist
idx.sentences.strip
trg_sentences.append
tqdm.autonotebook.trange
embeddings_file_path.endswith
torch.cuda.amp.autocast
opustools.OpusRead.printPairs
self.score_functions.keys
sentence_transformers.models.CLIPModel
indices_not_equal.unsqueeze
random.shuffle
set.update
transformers.T5ForConditionalGeneration.from_pretrained.to
SentenceTransformer._get_scheduler.step
copy.deepcopy
self.DenoisingAutoEncoderLoss.super.__init__
cross_inp.model.predict.tolist
features.get
torch.nn.ModuleList
super
MSMARCODataset
para.replace.strip
sentence_pairs.append
torch.diag.unsqueeze
torch.cosine_similarity
name.tokenized.to
torch.argmin
token_embeddings.size
sent2.strip.strip
list
scores_top_k_values.cpu.tolist.cpu
line.replace.strip.decode
embeddings2.embeddings1.pytorch_cos_sim.detach
write_mining_files
BatchSemiHardTripletLoss._masked_maximum
transformers.PreTrainedModel._tie_encoder_decoder_weights
fnmatch.fnmatch
model_card_templates.ModelCardTemplate.__TAGS__.copy
dev_samples.append
label_ids.to.to
trg_embeddings.src_embeddings.mean
torch.from_numpy
requests.get.raise_for_status
triplet_loss.sum.sum
torch.gather
self.ngram_lengths.add
torch.no_grad
query.strip
self.layer_weights.sum
sentence_transformers.cross_encoder.evaluation.CEBinaryClassificationEvaluator.from_input_examples
self.ngram_separator.join
self.cross_entropy_loss
BatchHardTripletLoss.get_anchor_negative_triplet_mask
self.similarity_fct.cpu
datasets.load_dataset.with_format
lookup.items
csv.writer
score_candidates
fOut.write
torch.nn.functional.cosine_similarity
torch.nn.Tanh
LexRank.degree_centrality_scores
transformers.T5Tokenizer.from_pretrained.prepare_seq2seq_batch
huggingface_hub.Repository
k_val.AveP_at_k.append
argparse.ArgumentParser.add_argument
pair_scores_top_k_values.cpu.tolist
weights.torch.FloatTensor.unsqueeze
importlib.import_module
tqdm.trange
schedulers.append
torch.max
torch.tensor.type_as
isinstance
transformers.T5Tokenizer.from_pretrained
self.ContrastiveLoss.super.__init__
self.get_sentence_embedding_dimension
train_samples_ConstrativeLoss.append
sentence_transformers.util.paraphrase_mining
getattr
line_target.strip.split
numpy.allclose
LayerNorm
source_sentences_list.append
labels.to.to
pickle.load
adjacency_not.repeat
transformers.T5ForConditionalGeneration.from_pretrained.generate
os.path.expanduser
format
sentence_transformers.losses.MultipleNegativesRankingLoss
train_data.items
self.cache.append
lzma.open
self.add_transitive_closure
self.batch_all_triplet_loss
negative_pairs.self.margin.F.relu.pow.sum
negative_pairs.self.margin.F.relu.pow
token.strip.strip
mask.sum.float
self.retokenize.clone
reversed
torch.argmax
self.BatchHardSoftMarginTripletLoss.super.__init__
self.sub_modules.items
numpy.set_printoptions
self.model1
self.main_score_function
tar.extractall
scores_top_k_idx.cpu.tolist
y2x_sim.mean
label_ids.prediction.torch.argmax.eq.sum.item
labels.float.mean
zip.open
cos_scores_top_k_values.cpu.tolist
readers.InputExample.InputExample
self.compute_metrices_batched
token.lower.strip
model_name_or_path.lower
heapq.heappush
sent_norm.replace.replace
source.keys
os.path.dirname
math.floor
self.Transformer.super.__init__
test_sentences2.append
torch.reshape
sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.from_input_examples
sentence_transformers.losses.ContrastiveTensionLossInBatchNegatives
self._named_members
os.path.getsize
self.MSELoss.super.__init__
cos_sim.topk
cluster_id.clustered_sentences.append
train_sentences.append
sklearn.metrics.pairwise.paired_cosine_distances
numpy.random.rand
json.load.append
nltk.sent_tokenize
sentence_transformers.util.cos_sim
qid.passage_cand.append
numpy.random.shuffle
self.batch_semi_hard_triplet_loss
sorted.index
sentence_transformers.models.CNN.get_word_embedding_dimension
torch.nn.LayerNorm
LSTM.load_state_dict
numpy.asarray.extend
scipy.special.softmax
clustered_sentences.items
src_sent.trg_lang.src_lang.translations.append
faiss.IndexIVFFlat.search
write_qids
argparse.ArgumentParser.parse_args
arch.endswith
self.similarity_fct.t
sent_embedding_dim_method
self._last_module
subdir.isdigit
self.state_dict
models.Pooling
datetime.datetime.now.strftime
bi_encoder.encode.to
model_card_templates.ModelCardTemplate.get_train_objective_info
transformers.AutoModel.from_pretrained
relevant_id.split.split
torch.stack
file_output_data.append
sentence_transformers.evaluation.TripletEvaluator.from_input_examples
set
collections.OrderedDict
fIn.read
torch.nn.functional.normalize.transpose
scores_top_k_values.cpu.tolist
mask_final.t.t
self.Dropout.super.__init__
cls
teacher_model.encode
sentence_transformers.util.http_get
loss_fct.backward
masked_minimums.min
sys.argv.replace.replace.replace
self.SoftmaxLoss.super.__init__
torch.sum.size
kNN
dev_labels.append
es.indices.exists
hasattr
sentence_transformers.models.LSTM.get_word_embedding_dimension
min
pickle.dump
sentence_transformers.SentenceTransformer.evaluate
self.teacher_model.encode
target_set.add
loss.get_config_dict
repo_name.split
torch.eye
sentences.items
negatives.append
WeightedLayerPooling
sentences.append
int.replace
join
chunk.append
numpy.random.seed
loss_model.parameters
torch.cuda.amp.GradScaler.update
i.cos_scores.topk
self.Pooling.super.__init__
sentence_transformers.evaluation.EmbeddingSimilarityEvaluator
annoy.AnnoyIndex
collections.defaultdict.keys
model_file.rfilename.split
cos_scores_top_k_values.cpu.tolist.cpu
readme_file.read
tqdm.tqdm.close
distutils.dir_util.copy_tree
scheduler.lower.step
list.pop
modes.append
id
all
label.strip
self.output_scores
input_ids.append
join.replace
self.distance_metric.t
self.weights.append
open.close
sklearn.metrics.pairwise.paired_euclidean_distances
sentences2.append
sorted
torch.cuda.amp.GradScaler.scale
embeddings2.embeddings1.pytorch_cos_sim.detach.cpu.numpy
sklearn.cluster.KMeans
np.arange
huggingface_hub.HfApi.model_info
zip
text.lower.lower
range
len.transpose
self.similarity_fct
self.model.named_parameters
target_sentences_list.append
label_ids.reshape
numpy.expand_dims
self.convs.append
output_vectors.append
line_source.strip
passages.append
coloredlogs.DEFAULT_FIELD_STYLES.copy
zip.extractall
self.retokenize
score
vectors.append
torch.FloatTensor
torch.diagonal
queue.PriorityQueue.put
json.dump
word.count
negatives_inside.repeat.repeat
scipy.stats.pearsonr
line_source.strip.split
transformers.T5ForConditionalGeneration.from_pretrained
self._load_auto_model
util.snapshot_download
json.load
module_class.load.save
torch.set_num_threads
self._text_length
open.write
word_embedding_model.tokenizer.get_vocab
model.encode.detach
endpoint.len.repo_url.strip
models.Transformer
float
fOut.flush
torch.arange
self.cache.pop
all_mrr_scores.append
argparse.ArgumentParser
numpy.ones
add_notice_log_level
enumerate.items
self.to
processor
query.pop
BatchSemiHardTripletLoss._masked_minimum
tarfile.open
util.batch_to_device
logging.WARN
self.distance_metric.repeat
file_open
ctx.Process.terminate
transformers.DataCollatorForWholeWordMask
sentence_transformers.losses.SoftmaxLoss
self.tokenizer.save_pretrained
sentence_transformers.models.BoW
callback
logging.getLogger.error
self.MarginMSELoss.super.__init__
self.linear
self.grouped_inputs.extend
score_candidates.max
results_queue.put
pair_scores_top_k_idx.cpu.tolist
num_negatives.append
np.concatenate
Dense
torch.tensor.unsqueeze
output.update
line.lower
query.append
sentence_transformers.util.pytorch_cos_sim
self.parameters
logging.getLogger.debug
InputExample
BatchHardTripletLoss.BatchHardTripletLoss.get_anchor_negative_triplet_mask
tokenizer.WhitespaceTokenizer
self._load_mt5_model
os.getenv
Transformer
numpy.array
faiss.IndexFlatIP.search
torch.multiprocessing.get_context.Queue
sts_data.items
self.add_dataset
sentence_transformers.datasets.ParallelSentencesDataset
torch.log1p.mean
self._log
SiameseDistanceMetric.vars.items
line_target.strip
random.seed
processes.append
self.next_entry
adjacency.float.to
valid_triplets.size
normalize_embeddings
transformers.AutoModelForSequenceClassification.from_pretrained
nltk.word_tokenize
set.copy
annoy.AnnoyIndex.add_item
gzip.open
create_markov_matrix_discrete
loss_model.to
download_corpora
huggingface_hub.hf_hub_url
cos_scores.cpu.cpu
poss.mean
self.processor.save_pretrained
self.tokenizer
label_ids.prediction.torch.argmax.eq.sum
sentence_transformers.losses.ContrastiveTensionDataLoader
torch.cuda.is_available
attention_mask.float
heapq.heappushpop
positive.self.model.detach
scores.append
dev_path.endswith
len.append
self.model.visual_projection
model_name_or_path.count
row.replace.replace
os.chmod
token_embeddings.shape.torch.arange.unsqueeze.unsqueeze
torch.load
self.LayerNorm.super.__init__
self.BatchAllTripletLoss.super.__init__
line.lower.strip.split
int
qid.dev_rel_docs.add
model_card_templates.ModelCardTemplate.__DEFAULT_VARS__.items
Dense.load_state_dict
self.layer_weights.unsqueeze.unsqueeze.unsqueeze
loss_model
test_sentences1.append
self.Normalize.super.__init__
torch.argsort
text.strip.lower
self.CosineSimilarityLoss.super.__init__
trg_lang.row.strip
attention_mask.unsqueeze
sentence_transformers.losses.MarginMSELoss
app.add_transform
torch.log1p
self.MultipleNegativesSymmetricRankingLoss.super.__init__
self.distance_metric.max
self.map_label
torch.nn.functional.relu.backward
mask_positives.to.to
faiss.IndexIVFFlat.add
k_val.ndcg.append
numpy.arange
seen_trg.add
src_lang.row.strip
batch1.append
self.config.sbert_ce_default_activation_function.util.import_from_string
vars
os.listdir
stsb_evaluator
sentence_transformers.SentenceTransformer.start_multi_process_pool
self._first_module.tokenize
gather_indices.unsqueeze.repeat
loss_fct
sentence_transformers.SentenceTransformer.get_sentence_embedding_dimension
numpy.asarray
model_lookup.items
torch.nn.functional.relu
score.cpu.detach
sys.argv.replace
labels.view
self.__len__
transformers.get_cosine_schedule_with_warmup
self.compute_metrices_individual
math.pow
labels.append
self.softmax_model
sentence_transformers.readers.InputExample
logging.getLogger.getEffectiveLevel
gather_indices.unsqueeze.unsqueeze
CNN.load_state_dict
output_data.append
torch.cuda.amp.GradScaler
evaluator.cpu
sentence_embedding.torch.stack.float
model_name.strip
self.loss_fct
hnswlib.Index.save_index
cos_scores_top_k_idx.cpu.tolist
huggingface_hub.Repository.lfs_track
sys.argv.replace.replace
torch.nn.functional.pairwise_distance
margin
files_to_create.append
key.batch.to
exit
np.argsort
dev_sentences.append
self.MegaBatchMarginLoss.super.__init__
self.BoW.super.__init__
label2ex.items
ret.append
scipy.sparse.csgraph.connected_components
numpy.argmax
list.update
self.processor
hnswlib.Index.set_ef
torch.nn.functional.relu.mean
sentences1.append
torch.nn.functional.softmax
info_loss_functions.ModelCardTemplate.__TRAINING_SECTION__.replace.replace
token_embeddings.transpose.transpose
relevant_qid.append
all_files.append
labels.BatchHardTripletLoss.get_anchor_positive_triplet_mask.float
sentence_transformers.SentenceTransformer.save
evaluator.evaluate.values
self.model2
features.unsqueeze
numpy.stack
create_markov_matrix
max_length_paragraph.sub_paragraphs.tokenizer.prepare_seq2seq_batch.to
tqdm.autonotebook.tqdm.update
torch.long.torch.float.self.config.num_labels.labels.torch.tensor.to.append
self.BatchSemiHardTripletLoss.super.__init__
elasticsearch.Elasticsearch.index
dev_evaluator
non_overlapped_community.append
all_questions.keys
images.append
self.distance_metric
test_sts_samples.append
embeddings.embeddings.util.cos_sim.numpy
evaluation.BinaryClassificationEvaluator.find_best_f1_and_threshold
numpy.linalg.norm
key.hard_negative_features.append
self.score_functions.items
torch.multiprocessing.get_context
ctx.Process.close
DenoisingAutoEncoderDataset.delete
dev_evaluator_sts
numpy.mean
score_function
self.save
json.loads
sentence_transformers.models.WordWeights
self.forward
sentence_transformers.datasets.DenoisingAutoEncoderDataset
numpy.where
paraphrase_mining_embeddings
data.max
torch.nn.LSTM
json.dumps
WhitespaceTokenizer
eval_examples.append
sentence_transformers.losses.BatchAllTripletLoss
sentences_decoded.self.tokenizer_decoder.to
groups.append
datasets.load_dataset
test_examples.append
transformers.AutoTokenizer.from_pretrained
transformers.AutoConfig.from_pretrained
all_ap_scores.append
self.collate_fn
torch.nn.functional.softmax.view
sentence_transformers.evaluation.ParaphraseMiningEvaluator
last_mask_id.attention.item
name.model_structure.append
torch.cat
LSTM
trec_dataset
tempfile.TemporaryDirectory
test_labels.append
self._eval_during_training
sentence_transformers.datasets.ParallelSentencesDataset.load_data
model.encode.cpu
enumerate
huggingface_hub.cached_download
next
ctx.Process.join
b.a.sum
labels.size
token.lower.lower
tqdm.autonotebook.tqdm
sentence_transformers.SentenceTransformer.fit
anchors.append
pair_scores_top_k_values.cpu.tolist.cpu
self.encoder
queue.PriorityQueue.empty
util.import_from_string.load
sklearn.decomposition.PCA.fit
transformers.get_constant_schedule_with_warmup
self.compute_dcg_at_k
train_ids.copy.add
sentence_transformers.models.WordEmbeddings.from_text_file
filename.sts_data.append
sentence_transformers.losses.ContrastiveTensionLoss
torch.diag
setuptools.find_packages
score_candidates.argmax
logging.addLevelName
torch.nn.Linear
silver_data.append
duplicate_ids.split.split
negatives_outside.t.t
sentence_transformers.models.Transformer
self.activation_function
token_weights.unsqueeze.expand
json.load.items
tokens_filtered.append
readers.InputExample
optimizer_class
self.flush
module.__dict__.items
self.classifier
pairwise_dot_score
sorted.append
transformers.get_cosine_with_hard_restarts_schedule_with_warmup
line.strip.strip
seq_evaluator
set.union
Asym.save
iter
filepath.endswith
token_embeddings.size.attention_mask.unsqueeze.expand.float.sum
train_samples.append
io.TextIOWrapper
input
self._modules.values
vectors.torch.cat.transpose
torch.sqrt
self.model.eval
self.noise_fn
self.cos_score_transformation
torch.clamp
annoy.AnnoyIndex.get_nns_by_vector
torch.sqrt.eq
numpy.dot
packaging.version.parse
ir_evaluator
os.path.realpath
app.add_domain
MultiDatasetDataLoader.MultiDatasetDataLoader
para.tokenizer.encode.to
sklearn.decomposition.PCA
torch.long.torch.float.self.config.num_labels.labels.torch.tensor.to
self.model.to
document.replace.split
embeddings.t
os.remove
hnswlib.Index.knn_query
open
test_samples.append
evaluators.append
sentence_transformers.cross_encoder.CrossEncoder
distraction_questions.keys
self._save_checkpoint
any
repo_id.replace
zip.namelist
embeddings2.embeddings1.pytorch_cos_sim.detach.cpu
features.unsqueeze.expand
attention_masks.append
max
os.path.abspath
sentence_transformers.losses.OnlineContrastiveLoss
dev_files.append
os.unlink
self.word2idx.keys
self.datasets.append
label.strip.lower
self.model.train
time.time
text.strip
BatchHardTripletLoss.BatchHardTripletLoss.get_triplet_mask.float
x.dot
anchor.keys
sentence_transformers.SentenceTransformer.encode
str
annoy.AnnoyIndex.build
prediction.size
s.lower
endpoint.HfApi.create_repo
corpus_embeddings.to.to
elasticsearch.helpers.bulk
self.BatchHardTripletLoss.super.__init__
sent_norm.replace.replace.replace
target.write
silver_samples.append
sentence_transformers.models.CNN
self.sentences.append
s.lower.append
name.self.sub_modules.get_sentence_embedding_dimension
distances.eq.float
self.dropout_layer
sentence_transformers.losses.MSELoss
torch.utils.data.DataLoader
sentence_transformers.evaluation.TranslationEvaluator
numpy.sum
os.path.isfile
sentence_transformers.losses.CosineSimilarityLoss
output.append
self.model
type
numpy.argsort
torch.topk
WeightedLayerPooling.load_state_dict
sentences_map.items
self.tokenizer.save
prediction.torch.argmax.eq
labels.BatchHardTripletLoss.get_anchor_negative_triplet_mask.float
connected_nodes
logging.getLogger.setLevel
torch.tensor
transformers.TrainingArguments
row.replace
easynmt.EasyNMT.translate_stream
train_sent.append
model.predict
passage.strip
self.ids.append
zipfile.ZipFile
self.criterion
row.replace.replace.replace
sentence_transformers.readers.STSBenchmarkDataReader
self.generate_data
self.CLIPModel.super.__init__
sklearn.cluster.AgglomerativeClustering
self.emb_layer
features.update
evaluator
negs.min.pow
loss_model.train
BatchHardTripletLoss.get_anchor_positive_triplet_mask
nlpaug.augmenter.word.ContextualWordEmbsAug
ctx.Process.start
os.path.join
x2y_sim.mean
self.model.zero_grad
sklearn.cluster.AgglomerativeClustering.fit
numpy.unique
source_sentence.sentences_map.add
req.headers.get
self.compute_metrics
torch.nn.Dropout
trg_ind.target_sentences.replace
np.stack
model_card.replace.strip
sentence_transformers.datasets.ParallelSentencesDataset.add_dataset
torch.ones
sentence_transformers.SentenceTransformer._first_module
read_eval_dataset
self.samples.values
token_embeddings.shape.torch.arange.unsqueeze
self._load_t5_model
nlpaug.augmenter.word.ContextualWordEmbsAug.augment
AttributeError
es.indices.create
os.path.exists
dev_file.endswith
len.replace
numpy.random.choice
torch.numel
queue.PriorityQueue.get
NotImplementedError
sentence_transformers.util.semantic_search
emb.numpy
torch.where
self.set_vocab
sentence_transformers.InputExample
sentence_transformers.evaluation.MSEEvaluator
header_name.split
pooling_fct_name.pooling_fct.model_card.replace.replace.replace
torch.cuda.device_count
self.ContrastiveTensionLossInBatchNegatives.super.__init__
vectors_concat.append
self.get_sentence_features
ValueError
batch2.append
self.layer_weights.unsqueeze
tokenizer_class.load.set_vocab
model_name_or_path.replace
self.get_labels
line.strip.split
annoy.AnnoyIndex.save
faiss.IndexFlatIP.add
rows.append
easynmt.EasyNMT
k_val.recall_at_k.append
sub_modules.items
_power_method
self.model.text_projection
sentence_transformers.cross_encoder.evaluation.CERerankingEvaluator
labels.device.labels.size.torch.eye.bool
list.extend
positive_pairs.pow.sum
train_sts_samples.append
self.tokenizer_decoder
torch.nn.functional.normalize
math.log
self.model.vision_model
Pooling
self.model.parameters
CNN
map
transformers.AutoTokenizer.from_pretrained.save_pretrained
tuple
activation_fct
k_val.precisions_at_k.append
sent.lower
reps_2.reps_1.torch.matmul.squeeze.squeeze
numpy.log2
torch.nn.Sigmoid
sentence_transformers.SentenceTransformer.encode_multi_process
optimizer_class.zero_grad
score.cpu.detach.numpy
self.ngram_separator.join.lower
np.linalg.norm
self._first_module.get_sentence_features
row.strip
os.path.basename
self.tokenize
sentence_transformers.cross_encoder.CrossEncoder.fit
label.sent1.train_data.add
pooling_mode.lower.lower
graph.keys
sentence_transformers.models.LSTM
labels.float.sum
Normalize
transformers.T5Tokenizer.from_pretrained.encode
triplets.append
tokenizer.decode.replace
models.Transformer.get_word_embedding_dimension
torch.nn.Module.__init__
transformers.T5Tokenizer.from_pretrained.decode
sentence_transformers.models.WordEmbeddings.from_text_file.get_word_embedding_dimension
sentence_transformers.evaluation.InformationRetrievalEvaluator
torch.cuda.amp.GradScaler.step
token_embeddings.size.attention_mask.unsqueeze.expand.float
examples.append
self.handleError
masked_maximums.max
scipy.stats.spearmanr
token_embeddings.shape.torch.arange.unsqueeze.unsqueeze.expand
ctx.Queue.get
hnswlib.Index.load_index
target.startswith
train_samples_MultipleNegativesRankingLoss.append
line.replace.strip
labels.unsqueeze.unsqueeze
label.already_seen.add
LayerNorm.load_state_dict
logging.getLogger.warning
line.rstrip.split
SentenceTransformer._get_scheduler
line.replace.strip.decode.partition
self.ngram_lookup.add
torch.is_tensor
logging.getLogger
torch.hub._get_torch_home
sentence_transformers.models.WKPooling
scheduler.lower.lower
BatchHardTripletLoss.BatchHardTripletLoss.get_triplet_mask
pool.close
labels.unsqueeze.t
self.queries_ids.append
torch.stack.size
util.fullname
negs.min
nltk.tokenize.treebank.TreebankWordDetokenizer.detokenize
transformers.Trainer.train
model
model_name.replace
self.distance_metric.pow
util.pytorch_cos_sim
pbar.update
nltk.tokenize.treebank.TreebankWordDetokenizer
BatchHardTripletLoss.BatchHardTripletLoss.get_triplet_mask.sum
optimizer_class.step
new_cluster.append
dev_sentences1.append
sentence_features.append
torch.nn.CrossEntropyLoss
join.split
query_embeddings.to.unsqueeze
self.auto_model.save_pretrained
ce_loss_fct
SentenceTransformer
module_key.self.sub_modules.tokenize
sentence_transformers.cross_encoder.CrossEncoder.predict
sentence_transformers.SentenceTransformer
callable
torch.nn.utils.rnn.pad_packed_sequence
self.dataset_indices.extend
numpy.concatenate
torch.nn.BCEWithLogitsLoss
transformers.AutoModelForCausalLM.from_pretrained
test_evaluator
negs.mean
print
sentence_transformers.LoggingHandler
app.add_config_value
csv.DictReader
torch.device
sentence_transformers.CrossEncoder
loss_model.zero_grad
sent1.strip.strip
hits.append
anchor_positive_dist.max
batch.append
self.Dense.super.__init__
torch.nn.Embedding
features.self.emb_layer.squeeze
qid.dev_samples.add
self.datasets_iterator.append
BoW
sentence_transformers.evaluation.BinaryClassificationEvaluator
Dropout
numpy.min
search_papers
self._get_scheduler
sentence_embedding.append
hnswlib.Index.init_index
docs.append
filename.self.dataset_folder.os.path.join.gzip.open.readlines
TripletDistanceMetric.vars.items
self.WordWeights.super.__init__
json.loads.keys
sentence_transformers.datasets.NoDuplicatesDataLoader
self.batch_hard_triplet_soft_margin_loss
sklearn.metrics.pairwise.paired_manhattan_distances
tqdm.tqdm.update
list.add
torch.nn.utils.clip_grad_norm_
sorted.add
filename.endswith
tl.mean
requests.get.iter_content
transformers.CLIPModel.from_pretrained
attention_mask.unsqueeze.expand
gold_samples.append
train_path.endswith
model_card_templates.ModelCardTemplate.model_card_get_pooling_function
word.lower
wiki_doc_freq.open.readlines
loss_model.named_parameters
all_layer_embedding.weight_factor.sum
os.rename
sentence_transformers.CrossEncoder.predict
query_embeddings.to.to
huggingface_hub.Repository.push_to_hub
parallel_sentences.append
self.ContrastiveTensionLoss.super.__init__
faiss.IndexFlatIP
image_text_info.append
PIL.Image.open
self.logit_scale.exp
line.strip
semantic_search
transformers.AutoModelForMaskedLM.from_pretrained.save_pretrained
self.get_config_dict
lm_logits.view
dev_duplicates.append
os.path.relpath
tarfile.open.extract
faiss.IndexIVFFlat
self.isEnabledFor
ImportError
model_card.replace.replace
torch.tensor.transpose
numpy.zeros
queries.keys
numpy.asarray.append
WordEmbeddings
line.rstrip
fIns.append
CLIPModel
self.decoder.resize_token_embeddings
self.csv_headers.append
huggingface_hub.HfApi
target_embeddings.self.source_embeddings.mean
data.min
transformers.DataCollatorForLanguageModeling
logging.basicConfig
seen_src.add
self.emb_layer.load_state_dict
opustools.OpusRead
tqdm.tqdm.write
transformers.CLIPProcessor.from_pretrained
query.replace.strip
model.encode.append
elasticsearch.Elasticsearch.search
outfile.line.strip
large_files.append
num_positives.append
new_sentences.append
self.tokenizer_encoder.batch_decode
os.makedirs
hnswlib.Index.add_items
pytrec_eval.RelevanceEvaluator.evaluate
faiss.IndexIVFFlat.train
evaluation.BinaryClassificationEvaluator.find_best_acc_and_threshold
labels.unsqueeze
tqdm.autonotebook.tqdm.close
sentence_transformers.evaluation.RerankingEvaluator
transformers.Trainer
scores_top_k_idx.cpu.tolist.cpu
numpy.max
texts_values.append
train_examples.append
sentence_transformers.cross_encoder.CrossEncoder.save
labels.float
token.strip.lower
weights.append
distances.self.margin.F.relu.pow
json.load.import_from_string
self.tokenizer.tokenize
dataloader.get_config_dict
attention_mask.float.unsqueeze
sentence_transformers.losses.DenoisingAutoEncoderLoss
torch.from_numpy.size
corpus.keys
torch.quantization.quantize_dynamic
coloredlogs.DEFAULT_LEVEL_STYLES.copy
dev_samples.keys
text.lower.split
torch.nn.Conv1d
sentence_transformers.cross_encoder.evaluation.CECorrelationEvaluator.from_input_examples
sentence_transformers.evaluation.SequentialEvaluator
torch.quantization.quantize_dynamic.evaluate
sentence_transformers.datasets.SentenceLabelDataset
util.import_from_string
pooling_fct.model_card.replace.replace
self._model_card_vars.items
model.eval
all_docs.extend
self.layer_weights.unsqueeze.unsqueeze
torch.mm
document.replace
transformers.T5EncoderModel.from_pretrained
translated_query.replace
transformers.MT5EncoderModel.from_pretrained
csv.reader
logging.info
poss.max
transformers.get_constant_schedule
torch.nn.Parameter
paragraph.strip

@developer Could you please help me check this issue? May I open a pull request to fix it? Thank you very much.

tomaarsen commented 5 months ago

The NLTK dependency has been removed, so this concern is resolved.