s-leroux / fin

Set of tools for personal investment
MIT License

Consider adding a tool to extract fundamental data from web pages #48

Open s-leroux opened 1 month ago

s-leroux commented 1 month ago

A lot of free fundamental data is available on web pages. We already have some experience with a web scraper: 162c94410c603a488efedf407451939db3be676c

The code above was written specifically for Investing.com. Can we have something more generic to parse table-like data?

The requirement is to parse `table` elements first and, eventually, pseudo-tables built from `div`/`span` constructs.
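To make the requirement concrete, here is a minimal sketch of what a generic extractor could look like, using only the standard library's `html.parser`. It is not the existing scraper: the names `TableExtractor`, `extract_rows`, `has_class`, and the `row`/`cell` class names in the usage lines are all hypothetical. It assumes well-nested markup with explicit end tags, so pages relying on omitted `</td>` or `</tr>` would need a more forgiving parser.

```python
from html.parser import HTMLParser

# Elements that never have an end tag; they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def is_tr(tag, attrs):
    """Default row predicate: a real <tr> element."""
    return tag == "tr"


def is_td(tag, attrs):
    """Default cell predicate: a real <td> or <th> element."""
    return tag in ("td", "th")


def has_class(tag_name, class_name):
    """Build a predicate matching <tag_name class="... class_name ...">."""
    def predicate(tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return tag == tag_name and class_name in classes
    return predicate


class TableExtractor(HTMLParser):
    """Collect rows of cell text; rows and cells are chosen by predicates."""

    def __init__(self, is_row=is_tr, is_cell=is_td):
        super().__init__()
        self.is_row = is_row
        self.is_cell = is_cell
        self.rows = []
        self._depth = 0          # nesting depth of the current element
        self._row_depth = None   # depth at which the open row started
        self._cell_depth = None  # depth at which the open cell started
        self._row = None
        self._text = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._depth += 1
        if self._row_depth is None:
            if self.is_row(tag, attrs):
                self._row_depth = self._depth
                self._row = []
        elif self._cell_depth is None and self.is_cell(tag, attrs):
            self._cell_depth = self._depth
            self._text = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth == self._cell_depth:
            # Closing the cell: join its text and collapse whitespace runs.
            self._row.append(" ".join("".join(self._text).split()))
            self._cell_depth = None
        elif self._depth == self._row_depth:
            self.rows.append(self._row)
            self._row_depth = None
        self._depth -= 1

    def handle_data(self, data):
        # Text is kept only while a cell is open; comments are ignored.
        if self._cell_depth is not None:
            self._text.append(data)


def extract_rows(html, is_row=is_tr, is_cell=is_td):
    parser = TableExtractor(is_row, is_cell)
    parser.feed(html)
    parser.close()
    return parser.rows


# A real table, then a pseudo-table using made-up class names.
print(extract_rows("<table><tr><td><span>Cash</span> (ttm)</td><td>1.5M</td></tr></table>"))
# [['Cash (ttm)', '1.5M']]
print(extract_rows('<div class="row"><span class="cell">Revenue</span>'
                   '<span class="cell">1.2B</span></div>',
                   is_row=has_class("div", "row"),
                   is_cell=has_class("span", "cell")))
# [['Revenue', '1.2B']]
```

Passing the row and cell tests as predicates keeps the traversal identical for both cases, so supporting a new site would mostly mean writing two small predicates.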

s-leroux commented 1 month ago

Here is a sample HTML fragment from Yahoo! Finance:

<!DOCTYPE html>
<html>
<head>
  <meta name="generator" content=
  "HTML Tidy for HTML5 for Linux version 5.2.0">
  <title></title>
</head>
<body>
  <div class="">
    <h3 class="Mt(20px)"><span>Cash Flow Statement</span></h3>
    <table class="W(100%) Bdcl(c)">
      <tbody>
        <tr class="Bxz(bb) H(36px) BdY Bdc($seperatorColor)">
          <td class=
          "Pos(st) Start(0) Bgc($lv2BgColor) fi-row:h_Bgc($hoverBgColor) Pend(10px) Miw(140px)">
          <span>Operating Cash Flow</span> <!-- -->(ttm)</td>
          <td class="Fw(500) Ta(end) Pstart(10px) Miw(60px)">
          17.13M</td>
        </tr>
        <tr class="Bxz(bb) H(36px) BdB Bdbc($seperatorColor)">
          <td class=
          "Pos(st) Start(0) Bgc($lv2BgColor) fi-row:h_Bgc($hoverBgColor) Pend(10px)">
          <span>Levered Free Cash Flow</span> <!-- -->(ttm)</td>
          <td class="Fw(500) Ta(end) Pstart(10px) Miw(60px)">
          -210.33M</td>
        </tr>
      </tbody>
    </table>
  </div>
</body>
</html>
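
The values in this fragment are abbreviated strings (`17.13M`, `-210.33M`), so whatever extracts the cells will probably also need to turn them into numbers. A possible sketch follows; `parse_amount` is a hypothetical helper, and the suffix table is an assumption since only `M` appears in the sample. `Decimal` is used to avoid binary floating-point rounding.

```python
from decimal import Decimal

# Assumed suffix table; only "M" actually appears in the sample above.
MULTIPLIERS = {
    "K": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "B": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
}


def parse_amount(text):
    """Convert a string such as '17.13M' to a Decimal number of units."""
    text = text.strip().replace(",", "")
    if text and text[-1] in MULTIPLIERS:
        return Decimal(text[:-1]) * MULTIPLIERS[text[-1]]
    # No known suffix: parse as a plain number (raises on malformed input).
    return Decimal(text)


print(parse_amount("17.13M"))    # 17130000.00
print(parse_amount("-210.33M"))  # -210330000.00
```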

s-leroux commented 1 month ago

Stopped for now.