Download and Parse Public Data Released by B3 Exchange #534

Closed msperlin closed 1 year ago

msperlin commented 2 years ago

Date accepted: 2023-02-08

Submitting Author Name: Marcelo S. Perlin
Submitting Author Github Handle: @msperlin
Other Package Authors Github handles: @wilsonfreitas
Repository:
Version submitted: 0.0.3
Submission type: Standard
Editor: @emilyriederer
Reviewers: @pachadotdev, @quishqa

Due date for @pachadotdev: 2022-10-16
Due date for @quishqa: 2022-10-24

Archive: TBD
Version accepted: TBD
Language: en

Package: rb3
Title: Download and Parse Public Data Released by B3 Exchange
Description: Download and parse public files released by B3 and convert them
    into useful formats and data structures common to data analysis
Version: 0.0.3
Authors@R: c(person("Wilson", "Freitas",
                    email = "",
                    role = c("aut", "cre")),
             person("Marcelo", "Perlin",
                    email = "",
                    role = "aut"))
License: MIT + file LICENSE
LazyData: true
    R (>= 4.1.0),
VignetteBuilder: knitr
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
Config/testthat/edition: 3
Encoding: UTF-8


The package downloads and organizes raw financial data directly from B3, the main financial exchange in Brazil. These datasets are not available in any other way.

Academic researchers and practitioners of financial markets.



Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (*Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

msperlin commented 2 years ago


Package License: MIT + file LICENSE

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

emilyriederer commented 2 years ago

Hello @msperlin and many thanks for your submission.

We are discussing whether the package is in scope and need a bit more information.

I see you are the author of other packages such as GetDFPData2 already on CRAN. Could you please help to clarify the differences between these two packages? Is the former focused only on financial statements whereas this package provides more granular time series? While I'm sure your two packages are not "competing", I want to fully understand what makes this package novel and how it is best in class.

Similarly, I see B3 data listed on general services like Yahoo Finance, which I know can be accessed through a number of packages including yfR, quantmod, and tidyquant. Could you elaborate on why none of these packages are in a "comparison set"?

Thank you!

msperlin commented 2 years ago

Hi @emilyriederer

Please find my replies below:

I see you are the author of other packages such as GetDFPData2 already on CRAN. Could you please help to clarify the differences between these two packages? Is the former focused only on financial statements whereas this package provides more granular time series? While I'm sure your two packages are not "competing", I want to fully understand what makes this package novel and how it is best in class.

The difference between GetDFPData2 and rb3 is in their scope and audience. GetDFPData2 covers the financial documents that companies release to the exchange, and its audience is mostly business analysts. This includes sales, profit, and much other financial information on an annual basis. Meanwhile, you can think of rb3 as an interface to all trade data available at the B3 website, including prices of many different types of contracts such as options, equities (companies) and futures. Its audience is mostly traders and market participants.

While there is some relationship between datasets from rb3 (stock prices) and GetDFPData2 (financial statements) for equities, there is none for other types of markets.

Similarly, I see B3 data listed on general services like Yahoo Finance, which I know can be accessed through a number of packages including yfR, quantmod, and tidyquant. Could you elaborate on why none of these packages are in a "comparison set"?

The data is indeed similar, but only for equities (companies). Yahoo Finance, for example, does not provide historical prices for futures or option contracts. Package rb3 also provides access to historical yield curves, which are a rare set of data in finance.

msperlin commented 2 years ago

@wilsonfreitas I just fixed the code for codemeta and contributing file. The other problems are deeper in the code. Can you please have a look?

wilsonfreitas commented 2 years ago

Hi @emilyriederer and @msperlin

I added examples to the functions show_templates and display_templates. rb3 is not a function; it is just the @name tag used in the package documentation. I changed that.

With respect to the <<- operator, it is used with a Reference Class, and according to the documentation this is the suggested way to modify object fields. So I believe this is not an issue.
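For readers unfamiliar with this idiom, here is a minimal sketch of how `<<-` is typically used inside a Reference Class method. The `Counter` class, its `count` field, and the `increment` method are hypothetical names chosen for illustration; they are not taken from rb3.

```r
library(methods)

# Hypothetical Reference Class for illustration only (not part of rb3)
Counter <- setRefClass(
  "Counter",
  fields = list(count = "numeric"),
  methods = list(
    increment = function() {
      # `<<-` assigns to the object's field; a plain `<-` here would
      # only create a local variable that is discarded when the method returns
      count <<- count + 1
      invisible(.self)
    }
  )
)

counter <- Counter$new(count = 0)
counter$increment()
counter$increment()
counter$count  # the field was modified in place; count is now 2
```

Because Reference Class objects have mutable state, the non-local assignment stays confined to the object's own fields rather than leaking into the global environment, which is the usual concern behind static-check warnings about `<<-`.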

wilsonfreitas commented 2 years ago

Furthermore, the test coverage is now 84%.

wilsonfreitas commented 2 years ago

emilyriederer commented 2 years ago

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

emilyriederer commented 2 years ago

Thanks for the answers on the package uniqueness, @msperlin . I really appreciate the additional context on the package's unique value. I will proceed to look for editors for this package. Would you mind adding more details and context to the README? After reading it, someone with little domain knowledge should understand the aim, goals, and functionality of the package.

msperlin commented 2 years ago

Thanks @emilyriederer. I added to the README an explicit list of the datasets available in rb3. This should give context to the reader and set rb3 apart from other packages.

emilyriederer commented 2 years ago

Hello again, @msperlin !

Thanks for updating the README.

I noticed that you are also currently going through active review for yfR (#523). After discussing with the editorial board, we think the best policy is for developers to undertake reviews sequentially. That way, any relevant feedback or discussions from one review can be applied more seamlessly to the next.

As such, I am going to apply the hold tag for now. Once #523 is complete, please ping me in this thread and we will pick back up where we left off.

msperlin commented 2 years ago

No problem @emilyriederer. Makes sense to me. I'll get back here once yfR finishes its review (probably in a month from now).

maelle commented 2 years ago

@msperlin do you still intend to submit this package?

msperlin commented 2 years ago

Hi @maelle,

@wilsonfreitas is the main author of the package and has done far more work on the code than myself. I believe the decision should be his.

wilsonfreitas commented 2 years ago

Hi @maelle and @msperlin let's go!

I believe we have to check if the changes broke the checks we have done in the past.

@msperlin could you help me with that?

msperlin commented 2 years ago

Great! I believe the code should be fine. I'll try to start the check from here.

msperlin commented 2 years ago

@ropensci-review-bot check package

Checks for rb3 (v0.0.7)

git hash: c8149e1c

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

emilyriederer commented 2 years ago

Thanks for the clarification, @mpadge ! With that, I think it sounds like we should be ready to proceed with the review. I will begin to work towards assigning a handling editor.

emilyriederer commented 2 years ago

@ropensci-review-bot assign @emilyriederer as editor

emilyriederer commented 2 years ago

Hi @msperlin and @wilsonfreitas - since my EiC term has ended, I have volunteered to be the editor for this package as well. I will begin considering reviewers, but in the meantime, please let me direct your attention to a few items found in the automated checks:

emilyriederer commented 2 years ago

Editor checks:

Editor comments

I think this package is quite close to being ready for review. I just want to note a few core issues:

Additionally, from the automated checks, I want to highlight a few items in particular:

wilsonfreitas commented 2 years ago

Hi @emilyriederer,

Thanks for your suggestions. I'm answering your questions here.

Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?

OK, I see valuable points here. I'll work on them.

Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Yes, they are, but we don't have a structured process for managing them.

I'm going to work on the other points: dependencies, formatting, function names ... Thanks

emilyriederer commented 2 years ago

Hi @wilsonfreitas ! I hope you are well. I thought I'd follow up and see if you had a chance to make any updates in response to my comments? Most of my comments were open-ended rather than hard, binary requirements. Once you've incorporated them to your satisfaction, I can begin to look for reviewers.

wilsonfreitas commented 2 years ago

Hi @emilyriederer, thanks!!

Unfortunately I am slower than I'd like, but I have a few updates.

In the, I notice there are no instructions about setting up API keys for testing. Can you please confirm that users do not need to provide any authentication information for these APIs?

The package makes heavy use of web scraping; it does not require any API to access the data, and no setup is necessary for testing. Should I make this clear in the

Please review and resolve the formatting issues found by lintr


You may consider whether it is a concern that index_get() is a function name used in other packages. I doubt if this package would be used much with elastic so I don't foresee the conflict arising frequently, but if another name suits you equally well, it might be better to change

If this is not blocking, I'd prefer to keep index_get as the function name.

Some stated imports (e.g. cli) are not found as being used. Can this dependency be removed?

Currently, I am working on this point.

Since this is an API-heavy package, please consider the rOpenSci guidance for http testing best practices. I think some of the tools listed there could be helpful to make the current checks more robust

I am reading this book to see what can be applied in the package.

wilsonfreitas commented 2 years ago

Hi @emilyriederer

I think I am done for now. The last commits have many changes implementing your suggestions. We can go forward with the process.

emilyriederer commented 2 years ago

Checks for rb3 (v0.0.7)

git hash: a68bc21f

(Checks marked with :eyes: may be optionally addressed.)

Package License: MIT + file LICENSE

Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

emilyriederer commented 2 years ago

Thanks @wilsonfreitas ! This all looks great. I will begin to look for reviewers 🎉

emilyriederer commented 2 years ago

@ropensci-review-bot seeking reviewers

emilyriederer commented 2 years ago

@ropensci-review-bot assign @pachadotdev as reviewer