Closed bhavaygg closed 3 years ago
Thanks for reporting!
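Honestly, we do not have much experience with Windows systems, but could you try simply renaming your biome.py script (to my_biome.py, for example)? Maybe it is just a namespace clash: a local file named biome.py can shadow the installed biome package, which would explain the "'biome' is not a package" message.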
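A quick way to check for this kind of name clash is to ask Python what a top-level name actually resolves to. The following is only a sketch using the standard library; the helper name describe_import is made up for illustration and is not part of biome.text.

```python
import importlib.util


def describe_import(name):
    """Report whether a top-level name resolves to a package, a plain module, or nothing.

    Hypothetical helper for illustration; it only inspects the import spec
    and does not import the module itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    # Packages carry a list of locations to search for submodules;
    # single-file modules (such as a stray biome.py script) do not.
    if spec.submodule_search_locations is None:
        return "module"
    return "package"


if __name__ == "__main__":
    # If this prints "module", a single file named biome.py is being picked up
    # instead of the installed package, so "import biome.text" cannot work.
    print(describe_import("biome"))
```

Run from the same directory as the failing script, a result of "module" for biome would point to a local biome.py taking precedence over the installed package.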
Thanks, that fixed it. But I am running into another error.
df=pd.read_csv("bert_train.csv")
df_train, df_test = train_test_split(df, test_size=0.1, random_state=RANDOM_SEED)
df_val, df_test = train_test_split(df_test, test_size=0.5, random_state=RANDOM_SEED)
pipeline_dict = {
    "name": "prot",
    "tokenizer": {
        "text_cleaning": {
            "rules": ["strip_spaces"]
        }
    },
    "features": {
        "word": {
            "embedding_dim": 64,
            "lowercase_tokens": True,
        },
        "char": {
            "embedding_dim": 32,
            "lowercase_characters": True,
            "encoder": {
                "type": "gru",
                "num_layers": 1,
                "hidden_size": 32,
                "bidirectional": True,
            },
            "dropout": 0.1,
        },
    },
    "head": {
        "type": "TextClassification",
        "labels": ["0", "1"],
        "pooler": {
            "type": "gru",
            "num_layers": 1,
            "hidden_size": 32,
            "bidirectional": True,
        },
        "feedforward": {
            "num_layers": 1,
            "hidden_dims": [32],
            "activations": ["relu"],
            "dropout": [0.0],
        },
    },
}
from biome.text import Pipeline
pl = Pipeline.from_config(pipeline_dict)
from biome.text.configuration import VocabularyConfiguration, WordFeatures
print(df_train)
vocab_config = VocabularyConfiguration(sources=[df_train], min_count={WordFeatures.namespace: 1000})
pl.create_vocabulary(vocab_config)
My dataframe looks like this
text label
371 MKK KKH KHH HHH HHH HHH HHH HHL HLV LVP VPR PR... 1
257 GSH SHM HMG MGS GSP SPN PNS NSP SPL PLK LKD KD... 1
and the stack trace is
  File "biomeee.py", line 58, in <module>
    pl.create_vocabulary(vocab_config)
  File "D:\Anaconda\envs\myenv\lib\site-packages\biome\text\pipeline.py", line 750, in create_vocabulary
    vocab = self._extend_vocabulary(vocabulary.create_empty_vocabulary(), config)
  File "D:\Anaconda\envs\myenv\lib\site-packages\biome\text\pipeline.py", line 690, in _extend_vocabulary
    instances_vocab = Vocabulary.from_instances(
  File "D:\Anaconda\envs\myenv\lib\site-packages\allennlp\data\vocabulary.py", line 292, in from_instances
    instance.count_vocab_items(namespace_token_counts)
AttributeError: 'str' object has no attribute 'count_vocab_items'
We do not support working with pandas DataFrames directly, but you can always create a Dataset from a DataFrame:
from biome.text import Dataset
train_ds = Dataset.from_pandas(df_train)
vocab_config = VocabularyConfiguration(sources=[train_ds], min_count={WordFeatures.namespace: 1000})
pl.create_vocabulary(vocab_config)
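As a hedged aside on why the raw DataFrame probably produced the 'str' object error: iterating over a pandas DataFrame yields its column labels rather than its rows, so any code that loops over the source expecting instances would receive plain strings instead. How biome.text consumes its sources internally is an assumption here, and the toy frame below is made up for illustration.

```python
import pandas as pd

# Toy frame with the same two columns as in the question (values are invented).
df = pd.DataFrame({"text": ["MKK KKH KHH", "GSH SHM HMG"], "label": [1, 1]})

# Iterating a DataFrame walks over its column labels, not its rows.
items = list(df)
print(items)

# Every item is a plain string, which has no count_vocab_items method.
all_strings = all(isinstance(item, str) for item in items)
print(all_strings)
```

Converting the frame with Dataset.from_pandas, as suggested above, avoids handing the column labels to the vocabulary builder.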
I assume you installed biome.text from master, which we recommend at the moment (until the new release):
pip install -U git+https://github.com/recognai/biome-text.git
Let me know if I can be of any further help!
Closing this. @Chokerino, feel free to create another issue if you have more questions.
Describe the bug
ModuleNotFoundError: No module named 'biome.text'; 'biome' is not a package
I tried installing from source and from pip, but both give the same error.
To Reproduce
OS environment
Additional context
biome --help works on cmd in both cases, and the Python version is 3.7.