Open nihey opened 8 years ago
Do we have some way to get data from B3 already?
As of now, there is no open way to scrape it that I know of.
It seems there are however some paid alternatives, like: Bolsa Financeira
After some searching, there seems to be a way: this page gets it over a WebSocket protocol, and the data seems to appear like this:
There is also this API, it seems to return just stock codes, but I do not know the full extent of it:
$ curl "https://data-bovespa.tradingview.com/search/?text=ABEV3&exchange=BMFBOVESPA&type=&domain=bovespa" -H "Origin: https://s.tradingview.com"
Whenever I get some free time, I'll try to migrate node-bovespa to use this (it won't be as smooth as wrapping a REST API, but it is doable).
This is weird. TradingView uses data from ICE Data Services. My main concern is the reliability of the data, so I wanted data as close as possible to Bovespa's own services.
Have you made any progress on finding another source for the data?
Alpha Vantage seems to be a great alternative. It is free (although it requires registration to obtain an API key) and has quotes from B3 (with some delay, of course). It is also worth noting that some of the assets listed on B3, like subscription rights, are not available in Alpha Vantage.
And of course, depending on your goal, it might not be a good idea to rely on a third party for stock quotes.
@lenilsonjr Thanks for the information, Alpha Vantage might be the easiest way to go for now. I'll work on it as soon as possible.
@lenilsonjr @nihey I could not find B3 quotes in Alpha Vantage. How should the symbol be passed in the URL? CSNA:B3? This is not working for me.
Don't know if it's still relevant to you, but I got it working by appending .SA to the stock symbol, similar to how Yahoo Finance does it.
Example:
Hey everyone, I'm planning to make this project alive once more.
I've been thinking about the plans on how to do it and found out that we can download historical series on these files.
My plan is to make an API out of it for querying historical data. After that, the plan is to offer it publicly (or self-hosted via the package).
Once the historical data API is done, I can add a realtime API via webscraping from various sources. It may be hard to maintain it, but we'll see the best way to handle it once we reach this point.
I've managed to set up a new API:
https://bovespa.nihey.org/api/quote/ABEV3/2018-02-14
I'll rewrite the code for node-bovespa so that it uses this brand new API. Until then, I recommend using this API directly for whoever needs to consume this data.
Note: it currently does not have realtime data, only historical data.
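A minimal sketch of building a request URL for this endpoint from Node, assuming the `/api/quote/<ticker>/<date>` path shape seen in the example URL. The helper name `quoteUrl` is hypothetical, and since the response format is not described in this thread, only the URL is built here.

```javascript
// Hypothetical helper: follows the path pattern of the example URL above.
const BASE_URL = 'https://bovespa.nihey.org/api/quote';

function quoteUrl(ticker, date) {
  // The example uses dates in the YYYY-MM-DD form.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`Expected a YYYY-MM-DD date, got "${date}"`);
  }
  return `${BASE_URL}/${encodeURIComponent(ticker.toUpperCase())}/${date}`;
}

console.log(quoteUrl('abev3', '2018-02-14'));
// prints "https://bovespa.nihey.org/api/quote/ABEV3/2018-02-14"
```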
Hey, this is awesome. I just read your email and I am all up to make this project be big. How can I help? Do we have a roadmap?
@djalmaaraujo Great to see you're interested :smile: !
I'll get everything organized this week and possibly create issues, but as a brief explanation on everything I wanted for this project, we should add:
Extract realtime data via webscraping from some of these platforms
Unit testing to receive notifications when the realtime API stops working because one of those sites changed.
Include the realtime data into the API
regularly update the historic series with official data from bovespa
a CLI utility to extract data from the API
# Something like:
$ bovespa ABEV3 -d 2018-01-08
<outputs the data>
an easy way to access the data
// Something like:
const bovespa = require('bovespa');
bovespa.get("ABEV3", "2018-01-08").then(...)
:arrow_up: That would be it for now, but I'm open for suggestions too.
@nihey A few days ago I received an email from XP's legal department, and a while ago also from bank Safra, because I had open source projects with their name or part of it. I am pretty sure the same will happen to this project in the near future, which sucks.
Also, I really love the idea but while I am very interested I am on zero time now, so I am sorry if I passed the wrong impression in my last comment.
I think the roadmap is great but I would ignore the CLI for now since this is not very used compared to an API. That's my opinion.
I will try to find time for this anytime soon. tks
@djalmaaraujo
> A few days ago I received an email from XP's legal department, and a while ago also from bank Safra, because I had open source projects with their name or part of it. I am pretty sure the same will happen to this project in the near future, which sucks.
That sucks a lot. I really hope that if this happens here, it takes a long time to happen.
> Also, I really love the idea but while I am very interested I am on zero time now, so I am sorry if I passed the wrong impression in my last comment.
No problems!
> I think the roadmap is great but I would ignore the CLI for now since this is not very used compared to an API. That's my opinion.
I think you're right. Although the CLI can be quite handy sometimes, it is not nearly as useful as the API.
I should definitely add some documentation, not only for the node API but for the REST API too, and this should be the main focus for the project (ideally aiming at the real-time API).
> I will try to find time for this anytime soon. tks
Alright, thanks!
I was searching for a solution to get real-time data and ended up here...
Any thoughts on getting real-time data?
I'm building a dashboard for myself, and my current solution uses Google Sheets. It is not the best one, as it does not reflect the current stock price (they say the delay can be around 20 minutes), but at least I get something close to the current price with a formula like =GOOGLEFINANCE("ITUB4"; "price")
Maybe for historical data, you can use the same? I think it's going to be pretty easy to maintain...
@krolow but how do you get the data from google sheets? I mean, is there a way to get the sheet generated data from code running outside (not using google script)?
BTW, I was playing with Alpha Vantage (using Node), but they now have some limits ("our standard API call frequency is 5 calls per minute and 500 calls per day") and, in this case, it would work just to get historical data - for example, updating daily a database with 500 assets at most.
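To put those limits in perspective, here is a rough sketch that spaces requests evenly so the average rate stays at 5 per minute and the total stays within 500 per day. The two limits come from the quoted notice; the function name `planCalls` and the even-spacing strategy are assumptions made for illustration.

```javascript
// Limits quoted above: 5 calls per minute and 500 calls per day.
const CALLS_PER_MINUTE = 5;
const CALLS_PER_DAY = 500;

// For each symbol that fits in the daily quota, returns the delay (in ms)
// after which its request should be sent. Symbols beyond the quota are dropped.
function planCalls(symbols) {
  const spacingMs = 60000 / CALLS_PER_MINUTE; // 12 seconds between calls
  return symbols.slice(0, CALLS_PER_DAY).map((symbol, i) => ({
    symbol,
    delayMs: i * spacingMs,
  }));
}

console.log(planCalls(['PETR4.SA', 'VALE3.SA', 'ITUB4.SA']));
```

At this pace a full day's quota of 500 calls takes 499 × 12 s, roughly 100 minutes, which fits a once-a-day historical update but not real-time polling.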
The way I'm doing it right now: I have a public sheet with all the stock codes, I set it to export as CSV, and I consume that CSV URL whenever I need to fetch the data... not perfect, but for my own purposes it's working well...
I also added several columns so I can have the stock's history for the last week, 30 days ago, 60 days ago, etc.
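A minimal sketch of consuming such a CSV export in Node. The column layout (`ticker,price` with a header row) is an assumption, as the actual sheet is not shown here, and the sample values are made up. The naive comma split also assumes unquoted fields and dot decimals, which may not hold for a sheet using a locale with comma decimals.

```javascript
// Assumed layout (hypothetical): a header row, then one "ticker,price" row per stock.
function parseQuotesCsv(csvText) {
  const [header, ...rows] = csvText.trim().split(/\r?\n/);
  const columns = header.split(',').map((name) => name.trim());
  return rows
    .filter((row) => row.trim() !== '')
    .map((row) => {
      const cells = row.split(',').map((cell) => cell.trim());
      const record = {};
      columns.forEach((name, i) => { record[name] = cells[i]; });
      record.price = Number(record.price); // convert the price text to a number
      return record;
    });
}

// Made-up sample data standing in for the downloaded CSV text.
const sample = 'ticker,price\nITUB4,27.15\nABEV3,14.02\n';
console.log(parseQuotesCsv(sample));
```

In practice `csvText` would be the body fetched from the sheet's published CSV URL.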
@krolow This might be useful too: https://sheetsu.com/ I have not used them yet, but it looks like a good alternative for extracting data from a spreadsheet via an API.
Nice. I did a small demo just to test the idea: https://github.com/krolow/carteira/ It basically consumes the CSV of a public sheet.
Hey guys, not sure if this helps at some point, but I am building an integrated database with all the companies/papers on B3.
https://www.tradertax.com.br/api/v1/b3_companies.json?query=xp
At some point this will be private, or not. I get info from a bunch of sources. Some companies don't have the CNPJ attached, I am working on that.
@djalmaaraujo I'm not sure if node-bovespa could use it, but I can see some uses for it in some of my personal projects.
Seeing these last updates, I think I've come up with a reliable solution for keeping the real-time data. It will take me some time to implement, but it will surely be a major enhancement to this project.
@nihey That's what I thought. Are you using the spreadsheet solution or alphavantage?
@djalmaaraujo I'm using the SpreadSheet solution for now.
@nihey I could not find in the GOOGLEFINANCE formula ETF's: SMLL11 for example. Can you?
@djalmaaraujo I could not find SMLL11 indeed, even though some other ETFs could be found (like XBOV11). It seems we may have to extract this data from AlphaVantage (or another source anyway).
Have you tried SMAL11? It worked for me here. I have also tried the ETF IVVB11, and it works as well.
@krolow Interesting. The hard part is to find all possibilities, because it's not the ticker name.
yeah, at least the FIIS and stocks itself all the codes that I have tried work just fine, maybe only the ETF ones that have this difference? 🤔
So, I'm finally coming up with a solution: I've built a Google Finance API proxy that consumes data from spreadsheets and stores it in the database. If any of you want to try it out, I'm keeping the API at: https://bovespa.nihey.org/api/realtime (or https://bovespa.nihey.org/api/realtime/ABEV3).
I'll also add AlphaVantage API proxying at some point, but it seems like this should be enough for the next version.
So, finishing a couple of things will mark the next version of node-bovespa:
cli
README.md
@nihey This is a great addition. Great job. Realtime is actually 15min, right? I saw google formulas uses 15min delay, I think
@djalmaaraujo Yes, that's right! As Google Finance API actually has delayed data, we are limited by it, it is not actually "realtime".
Hello,
I've been trying to solve the same issue, but I don't quite like the Google Finance solution. As mentioned before, there is no SMAL11 and other ETFs. It also does not have commodities, such as gold (OZ1D, OZ2D, OZ3D), and also no options.
Real time is an issue, because there appears to be no API at all for the assets I mentioned. Some APIs, such as Yahoo Finance and Alpha Vantage, only have reliable delayed data for SMAL11 and other ETFs, but not for commodities. But all of those break on time series data beyond a couple of days or so. It looks like Alpha Vantage keeps reliable data at 5m intervals, for everything it has, for 2 days only.
For daily data I've been using B3 historical data, which is behind a captcha and available about 3 hours after the market closes. It has reliable data for most options, ETFs and BDRs as well as regular stocks. But being behind a captcha makes it OK for development but bad for automation.
Another problem with B3 is that it does not contain splits and adjusted data points. And I was not able to find an open api for splits and other corporate events.
So recently I discovered BMF FTP service, which is 2 days behind, but it has all operations, not only open/close/min/max/vol data points. It is easy to crawl and automate. But I don't have any idea if it will be kept up indefinitely.
ftp://ftp.bmf.com.br/MarketData/NEG_LAYOUT_english.txt
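The plan I am thinking of for my project is as follows:
When bootstrapping the DB, use BMF FTP data, easy to crawl and extract data with filesystem → gunzip → split → parse streams
Crawl BMF FTP daily for 2 days ago quotes
All BMF data stored in DB
Use Alpha Vantage plus in memory cache for past 2 days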
This looks like the least shortcoming possible. The only missing data within 2 days would be commodities and 5m and 1min data is restricted to 1 day only on all other assets.
I thought I should share the idea here, as I still didn't build an abstraction and DB schema for that and could potentially use and contribute to this project if you are also aiming to close the gaps on ETFs, BDRs and commodities as best as we can. Do you have any thoughts on that?
Forgot to mention: Looks like splits and dividends are possible to crawl from Alpha Vantage's TIME_SERIES_DAILY_ADJUSTED. Not sure how it would fit in my idea yet 🤔
> I've been trying to solve the same issue, but I don't quite like the Google Finance solution. As mentioned before, there is no SMAL11 and other ETFs. It also does not have commodities, such as gold (OZ1D, OZ2D, OZ3D), and also no options.
Yeah, over time, I've found the google_finance solution to be quite limited. I'll make another version with it just so we can fill some gaps, but it surely won't be enough.
> For daily data I've been using B3 historical data, which is behind a captcha and available about 3 hours after the market closes. It has reliable data for most options, ETFs and BDRs as well as regular stocks. But being behind a captcha makes it OK for development but bad for automation.
I'm using those to provide the historical API too. The funny thing is: their CAPTCHA is not actually very strong. If you get the download link for their files and look at its name, you'll find out you can download it straight from the website, without having to solve the CAPTCHA again. This allowed me to automate their data extraction entirely.
Example: You can download 2019 data directly from: http://bvmf.bmfbovespa.com.br/InstDados/SerHist/COTAHIST_A2019.zip . If you want further details, you can look at this file, where I do this.
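As a small illustration, the direct link can be derived from the year. This sketch assumes that files for other years follow the same naming as the 2019 example, which the thread does not confirm; the helper name `cotahistUrl` is hypothetical.

```javascript
// URL pattern taken from the 2019 example above; other years are ASSUMED
// to follow the same naming.
function cotahistUrl(year) {
  if (!Number.isInteger(year) || year < 1000 || year > 9999) {
    throw new Error(`Expected a four-digit year, got ${year}`);
  }
  return `http://bvmf.bmfbovespa.com.br/InstDados/SerHist/COTAHIST_A${year}.zip`;
}

console.log(cotahistUrl(2019));
// prints "http://bvmf.bmfbovespa.com.br/InstDados/SerHist/COTAHIST_A2019.zip"
```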
> When bootstrapping the DB, use BMF FTP data, easy to crawl and extract data with filesystem → gunzip → split → parse streams
> Crawl BMF FTP daily for 2 days ago quotes
> All BMF data stored in DB
> Use Alpha Vantage plus in memory cache for past 2 days
I think this is the way to go. This is actually what I'm doing with this project too. The only things that could be improved are the data sources used to feed the database. As of now, I believe the best setting would be: B3's site historical data for past day data, Google Finance + AlphaVantage for realtime.
> Forgot to mention: Looks like splits and dividends are possible to crawl from Alpha Vantage's TIME_SERIES_DAILY_ADJUSTED. Not sure how it would fit in my idea yet
That's very good information to crawl; it sounds like something that would be very hard to extract directly from B3's website. Thanks for the information!
Here is something I found out during the weekend:
The closing price in COTAHIST_A2019.zip and similar files (I don't know how far back this goes) is wrong! It uses the highest buy offer at closing time instead of the last successful transaction. From the BMF FTP, not only can I get historical data for options and commodities, but the information also looks more reliable.
I didn't compare with Alpha Vantage or other free APIs. I did compare COTAHIST_A2019.zip to NEG_20190812.gz (after running reduce operations) and also with my home broker. COTAHIST_A2019.zip was the one that looked wrong to me.
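A sketch of what such a reduce over individual trades could look like, with the close taken from the last executed trade rather than from an offer. The trade record shape (`time`, `price`, `quantity`) is hypothetical and does not reflect the actual NEG file layout, and the sample trades are made up.

```javascript
// Reduces a list of executed trades into one daily summary.
// Hypothetical trade shape: { time: 'HH:MM:SS', price: number, quantity: number }.
function summarizeTrades(trades) {
  if (trades.length === 0) return null;
  // Order by time so that "open" and "close" are the first and last trades.
  const ordered = [...trades].sort((a, b) => a.time.localeCompare(b.time));
  return ordered.reduce(
    (day, trade) => ({
      open: day.open,
      close: trade.price, // the last executed trade ends up as the close
      min: Math.min(day.min, trade.price),
      max: Math.max(day.max, trade.price),
      volume: day.volume + trade.quantity,
    }),
    { open: ordered[0].price, close: ordered[0].price, min: Infinity, max: -Infinity, volume: 0 }
  );
}

// Made-up sample trades, deliberately out of order.
const trades = [
  { time: '10:00:00', price: 10.0, quantity: 100 },
  { time: '16:55:00', price: 10.4, quantity: 200 },
  { time: '12:30:00', price: 9.8, quantity: 300 },
];
console.log(summarizeTrades(trades));
// prints { open: 10, close: 10.4, min: 9.8, max: 10.4, volume: 600 }
```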
I am not sure I want to report this to BMF, in case they discover these URLs are still up and decide to remove them. I don't know if you noticed, but they stopped linking these particular files from the official B3 website. The URL still exists, but the navigation won't let people without the bookmark find it again.
I plan to have this code extracted from my codebase as small npm modules. Maybe stuff like 'bmf-parser' and 'cotahist-parser'. Something like that, that we can all share and help maintain, improve etc. For now, each day takes about 7 seconds on my i7 computer with SSD. I am reading the compressed file and streaming results using es-stream internally. I'll try to find some time to accomplish this before the end of the week, but it might take a while, as I don't like to publish npm packages without testing and I am still in the prototype stage (no tests written yet).
Hi, we have all this data and more, from bonds and funds to equities. We want to make it available for more people to use (for free), but we need some help. If you guys want, we can talk to see what we can develop together: gabriel@carteiraglobal.com
👋 Hey guys! It was a loooong thread... I just landed here after bootstrapping another solution this weekend, and I learned some things here the hard way 😅! Great job you are doing here @nihey... nice parser you wrote! So, my 50 cents for this thread:
curl https://arquivos.b3.com.br/apinegocios/ticker/ITUB4/2020-06-08
@gfcantu I would like to help
@milesibastos Hey! It's good to hear you liked it!
My work has been a little rough over the last weeks/months and I haven't been able to give as much attention to the project as I would like to.
> curl https://arquivos.b3.com.br/apinegocios/ticker/ITUB4/2020-06-08
This API looks very promising, I believe it will be the easiest one to integrate so far.
Just sharing on this thread some thoughts I shared with @gfcantu privately too:
It seems we can also integrate Yahoo Finance as another source, they have an API that exposes JSON Data. An example of their API is:
"https://query1.finance.yahoo.com/v8/finance/chart/ITSA4.SA?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d&corsDomain=finance.yahoo.com&.tsrc=finance"
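A hedged sketch of building that URL for any B3 ticker from Node. The `.SA` suffix and the query parameters are the ones that appear in this thread; the helper name `yahooChartUrl` is hypothetical, and the `corsDomain` and `.tsrc` parameters are left out on the assumption that they are not required, which has not been verified.

```javascript
// Builds a chart URL shaped like the example above.
function yahooChartUrl(ticker, { interval = '2m', range = '1d' } = {}) {
  const upper = ticker.toUpperCase();
  // B3 tickers carry a ".SA" suffix on Yahoo Finance (see earlier in the thread).
  const symbol = upper.endsWith('.SA') ? upper : `${upper}.SA`;
  const params = new URLSearchParams({
    region: 'US',
    lang: 'en-US',
    includePrePost: 'false',
    interval,
    range,
  });
  return `https://query1.finance.yahoo.com/v8/finance/chart/${encodeURIComponent(symbol)}?${params}`;
}

console.log(yahooChartUrl('itsa4'));
// prints "https://query1.finance.yahoo.com/v8/finance/chart/ITSA4.SA?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d"
```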
@irae That's good to know, and it's really bad :disappointed: . Unfortunately we still do not have other sources in the database to compare against the COTAHIST_A2019.zip files. But after adding at least one or two, one key feature would be to compare the prices that all sources provide and select the ones that conflict the least.
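One possible way to do that comparison, sketched under assumptions: take the median across sources as the consensus price and flag any source that deviates from it by more than a tolerance. The median rule, the tolerance value, the function name `reconcile`, and the sample prices are all illustrative choices, not something the project has settled on.

```javascript
// prices: { sourceName: closingPrice }. Source names are only labels here.
function reconcile(prices, tolerance = 0.01) {
  const values = Object.values(prices).sort((a, b) => a - b);
  if (values.length === 0) throw new Error('At least one source is required');
  const mid = Math.floor(values.length / 2);
  const median = values.length % 2 ? values[mid] : (values[mid - 1] + values[mid]) / 2;
  // Sources whose price is further than `tolerance` from the median.
  const outliers = Object.keys(prices).filter(
    (source) => Math.abs(prices[source] - median) > tolerance
  );
  return { price: median, outliers };
}

// Made-up prices for a single ticker and day.
console.log(reconcile({ cotahist: 10.35, bmfFtp: 10.4, yahoo: 10.4 }));
// prints { price: 10.4, outliers: [ 'cotahist' ] }
```

With only two sources the median cannot tell which one is wrong, so this approach needs at least three sources to be useful.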
Wow! I was about to reply saying that the Yahoo Finance API is not worth it because symbols ending in "11" were broken. But apparently...
In turn, this fixed both the graphs on their site and AlphaVantage now has full data for a bug report I've sent both Yahoo and AlphaVantage about 2 years ago.
Details collapsed because I don't want to rewrite everything, I'll leave in the original tone for historical sake 😄
BTW, SMAL11 is fixed:
curl "https://query1.finance.yahoo.com/v8/finance/chart/SMAL11.SA?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d&corsDomain=finance.yahoo.com&.tsrc=finance"
IMAB11 (used to just 404) now exists, but data is broken like SMAL11 was before:
curl "https://query1.finance.yahoo.com/v8/finance/chart/IMAB11.SA?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d&corsDomain=finance.yahoo.com&.tsrc=finance"
What a marvelous job over here @nihey. I'm looking forward to working with it on a new project I'm thinking about: a Telegram bot account that sends notifications to the user according to the price of the desired stock, just to get a nice heads-up when a certain stock passes a resistance, for example. I think 15 minutes is good enough for my purpose; realtime would be a killer feature in the long run, considering the limited number of calls the bot may have when pinging the API.
The Yahoo Finance implementation seems feasible too, best wishes and good trades!
Guys, I also need some help gathering Bovespa stocks from Alpha Vantage. I tried MTSA4, MTSA4.SAO, PETR4, PETR4.SAO and ^IBOV, and none of these symbols returns anything. Can someone show me where I'm going wrong?
@Bordotti did you read the past comments? There are a lot of correct stock symbols for Bovespa there. I am pretty sure you can spot how it is done.
Sorry, I wasn't clear: I need the fundamental data. The TimeSeries function worked.
> Guys, I also need some help gathering Bovespa stocks from Alpha Vantage. I tried MTSA4, MTSA4.SAO, PETR4, PETR4.SAO and ^IBOV, and none of these symbols returns anything. Can someone show me where I'm going wrong?
I don't know if this can help you man:
https://www.alphavantage.co/query?function=SYMBOL_SEARCH&keywords=IBOV&apikey=HBOWYYPHIYUXVK86
In my case, I'm trying to find a source to get 5min interval data of the "Mini Ibovespa" index (WIN) in intraday.
Does anyone know if I can get it in Yahoo Finance API ?
@nihey , congrats about the project! 😉 And everyone else, you are doing a great job so far! 🏆
A hope: https://developers.b3.com.br/
Yesterday I spent the day working on this API: https://b3-price.tradertax.com.br/tickers It has a limited number of tickers, for personal reasons. If you want the code, just tell me and I'll add you to the project; I just won't make it open, because I have already received a lawyer's letter from XP over a crawler.
In short, I'm using this link: https://arquivos.b3.com.br/apinegocios/ticker/ITUB4/2020-06-08
I created an API, updated every 24 hours, with ALL Bovespa tickers: it includes funds, BDRs and options.
It is 11.6 MB of data in a single request and I haven't added pagination yet, so I'll grant access on request, free of charge, to anyone who wants it. Email me at djalma [at] nossomos.cc.
@djalmaaraujo I'm interested in contributing to the project.
I need to find another way to scrape the data.