Closed bernardmizzi closed 4 years ago
Make sure you successfully downloaded the language model (models/language_model/finbertTRC2/pytorch_model.bin should be about 400 MB). Try using git-lfs, or download the model directly from the GitHub web page.
Thanks for your feedback.
Moreover, how can I construct the files train.csv, validation.csv, test.csv?
Apologies if I was unclear, but my main question is how to retrieve the train, validation and test data and put it in those files?
Hi all, I also hit this problem when I ran the configuring parameters cell. I then tried to download pytorch_model.bin with git-lfs, but I'm getting this error. It looks like a service limit. Any help would be appreciated. Many thanks!
Take a look at this https://github.com/ProsusAI/finBERT/issues/8
Thanks for your help! But after running wget https://github.com/ProsusAI/finBERT/raw/master/models/language_model/finbertTRC2/pytorch_model.bin the file I got is also the 134kb one, not the original 400 MB one.
When I ran git lfs pull, it told me:
"batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access. error: failed to fetch some objects from 'https://github.com/ProsusAI/finBERT.git/info/lfs'"
This is probably related to this issue.
You could manually download https://github.com/ProsusAI/finBERT/raw/master/models/language_model/finbertTRC2/pytorch_model.bin from the browser; that's what I did.
Did you try manually downloading the file from the browser from https://github.com/ProsusAI/finBERT/raw/master/models/language_model/finbertTRC2/pytorch_model.bin? It worked for me and the downloaded file is approximately 400MB
Can you share your local copy of the model file? This method no longer works due to GitHub bandwidth restrictions. I can download the file but it's only 134 bytes. Thank you
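A quick way to tell whether a download is the real weights or just a stub: a file of roughly 134 bytes is consistent with a Git LFS pointer file, which is plain text and starts with a fixed version line. The sketch below checks for that prefix. The helper names are made up for illustration, and the path in the demo is simply the one discussed in this thread.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def describe_download(path):
    """Summarize the file size and whether it looks like an LFS pointer."""
    size = Path(path).stat().st_size
    if is_lfs_pointer(path):
        return f"{path}: {size} bytes, Git LFS pointer (not the real model)"
    return f"{path}: {size / 1e6:.1f} MB, looks like binary data"


if __name__ == "__main__":
    # Path taken from this thread; adjust to wherever the file was saved.
    model_path = Path("models/language_model/finbertTRC2/pytorch_model.bin")
    if model_path.exists():
        print(describe_download(model_path))
    else:
        print(f"{model_path} not found")
```

If the check reports a pointer, loading the file as a model will fail, so the download needs to be repeated from a source that serves the full file.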
@davidifshk you can also use my link if you want ^
@bernardmizzi Thank you. This is going to benefit more people with the same issue.
No problem, glad I could help
@bernardmizzi Sorry to ask again, but could you please also share the model under classifier_model/finbert-sentiment? I believe that one cannot be downloaded either. I really appreciate your help!
That model is created by training on a particular dataset, so you'll have to run the notebook finBERT/notebooks/finbert_training.ipynb yourself; mine is trained on my own data. If you want, I'll give you mine, but it is trained on Reddit news headlines and, unsurprisingly, it reported very low accuracy.
That's okay. Thank you very much!
Should you need help with running the notebook just send me a message as I got it up and running.
It works! Thank you very much! I'm going to run the training with the dataset from FinancialPhraseBank first.
Could someone describe the expected data structure of train.csv? I got an error when I ran the cell 'get_data()'. Here is the data structure of my train.csv. Is there anything wrong?
Fixed. I had used the wrong separator character (',') when exporting the CSV file.
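For anyone building train.csv, validation.csv and test.csv by hand, here is a minimal sketch that splits labeled rows and writes them with a tab separator instead of a comma. Two things are assumptions and not confirmed in this thread: that a tab is the separator the notebook expects, and that the column layout is index, text, label with a header row. The function names and split fractions are only illustrative.

```python
import csv
import os
import random


def split_rows(rows, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle (text, label) rows and split them into train/validation/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


def write_split(path, rows):
    """Write rows as a tab-separated file (assumed layout: index, text, label)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")  # tab separator is an assumption
        writer.writerow(["", "text", "label"])  # header row; empty index name
        for i, (text, label) in enumerate(rows):
            writer.writerow([i, text, label])


def write_all_splits(rows, out_dir):
    """Create train.csv, validation.csv and test.csv inside `out_dir`."""
    os.makedirs(out_dir, exist_ok=True)
    train, val, test = split_rows(rows)
    for name, part in (("train", train), ("validation", val), ("test", test)):
        write_split(os.path.join(out_dir, f"{name}.csv"), part)
```

Calling write_all_splits(rows, "data") with a list of (text, label) pairs would produce the three files in a data directory. If get_data() still fails, open train.csv in a text editor and compare the separator and columns with what the notebook reads.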
@davidifshk I wasn't able to run the model on the PhraseBank dataset as I was getting encoding errors on both Windows and Ubuntu systems, so I opted for another dataset.
I see. I have already run the model on the PhraseBank dataset; the result is shown below.
@davidifshk would it be a problem to provide me the code you used to open and format the PhraseBank dataset as I was getting encoding errors?
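On the encoding errors: one likely cause, offered here as an assumption and not something confirmed in this thread, is that the FinancialPhraseBank text files are not UTF-8 but Latin-1 (ISO-8859-1), with each line in the form sentence@label. A minimal sketch for reading such a file under those assumptions (the function name is made up for illustration):

```python
def read_phrasebank(path, encoding="latin-1"):
    """Read lines of the form 'sentence@label' into (text, label) tuples.

    The Latin-1 default and the '@' separator are assumptions about the
    dataset files; change them if your copy differs.
    """
    rows = []
    with open(path, encoding=encoding) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the LAST '@' so an '@' inside the sentence is kept.
            text, sep, label = line.rpartition("@")
            if not sep:
                continue  # no separator found; skip the malformed line
            rows.append((text, label))
    return rows
```

The returned list of (text, label) pairs can then be split and exported to train.csv, validation.csv and test.csv.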
I'm trying to use finbert to classify new articles into several different categories in the banking domain. Which model should I use for classification: the natural language model or the classification model? Thanks.
You have to run the notebook FinBERT/notebooks/finbert_training.ipynb, which will train the language model and create a new classification model. As the notebook continues running, it will use that model for classification.
@bernardmizzi Your Google Drive link to the model has expired, can you re-upload it please? When trying to download the model from the repository I get this error:
This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
Thanks a lot @bernardmizzi ! Could you upload also the sentiment model weights?
The model is already pre-trained and can be used. I think the model weights are embedded within the model. To run finbert, all you need is the PyTorch model bin file and its config.
Indeed, weights are embedded within a model. It's just that there are 2 different models on this repo, one is language model and one is sentiment model (see picture below). On your drive you uploaded the language model, could you upload the sentiment model too? Thanks!
You'll have to run the notebook finbert_training.ipynb since the model you are asking for is fine-tuned (trained) on a certain dataset, and that depends on which dataset you want
I actually need it fine-tuned on financial news, so if you can upload the fine-tuned version of the sentiment-analysis one, I'd be glad! Thank you anyway.
@bernardmizzi you're right, I didn't go through the README carefully enough to notice that. Thanks for your help! @clone95 I will fine-tune the model for sentiment analysis in the coming days and can then upload that version.
Apologies if I was unclear, but my main question is how to retrieve the train, validation and test data and put it in those files?
Hi, how can this issue be resolved?
Good morning,
I am running the configuring parameters cell and I am getting the below error:
UnpicklingError Traceback (most recent call last)