reZach / my-budget

Free, open source offline cross-platform budgeting solution built with Electron.
GNU Affero General Public License v3.0
956 stars 61 forks

Feature: Bank connection (EU banks) #3

xvilo closed this issue 2 years ago

xvilo commented 5 years ago

I would love to see a way of adding your bank through the app. Several fintech banks provide APIs:

Or think of a bigger collective:

The idea is to have your transactions imported automatically so you can sort through them. Doing this manually is quite a lot of work.

reZach commented 5 years ago

@xvilo It's in the works.

reZach commented 5 years ago

@xvilo This is now available with limited support. Please see here for more information.

reZach commented 5 years ago

Currently trying to get a hold of customer service to try and get OFX support implemented.

ross-ritchey commented 5 years ago

@reZach -> Plaid is free for up to 100 account connections. I would suggest building out integration for Plaid and adding a configuration section in the app for people to add THEIR OWN Plaid account credentials. This takes the cost off of you, but still allows people to use a service like Plaid. It also removes the herculean effort needed to build out connectors to individual banks.

reZach commented 5 years ago

@ross-ritchey I thought so too, but it's actually only 100 successful connections. I did try to use Plaid for testing, but after each successful connection/import, the number of remaining connections went down, to 99, 98...

That was my original line of thought too.

ross-ritchey commented 5 years ago

@reZach That is really strange. Their pricing page says 100 "items" and defines an item as "a set of credentials at a financial institution" - which implies that you don't have a connection limit.

Though that said -> their pricing page is terrible at actually delivering any useful information so I'm not surprised to see that type of limit imposed. Oh well, it was a thought.

reZach commented 5 years ago

@ross-ritchey I thought so too. I thought I had up to 100 sets of credentials to use, but when I thought about it, I asked myself "who really has logins to 100 financial institutions?"

It makes sense they have a limit, otherwise everyone would be on the free tier.

Dan-inpooling commented 5 years ago

If it's on Plaid - then it won't be "our information stays offline".

reZach commented 5 years ago

This issue really captures two things.

1) (Which was listed), pulling in transaction details from EU banks. After reading a little bit from the bunq webpage, it looks like we need certain types of licenses to use or access APIs around UK banks to pull transaction data.
2) Pulling in transaction details from non-EU banks. This is partially done and being worked on at the moment.

I'm going to create a new issue for detail 2, and update detail on this issue to reflect EU banks.

reZach commented 5 years ago

I created a new issue to track the status of non-EU banks: https://github.com/reZach/my-budget/issues/8.

I added a help wanted label because there are obvious legal protections in place for EU banks for importing transactions and we need to be compliant if necessary. @xvilo are you familiar with the rules, or can you get in touch with someone with legal expertise in this area? Thanks.

xvilo commented 5 years ago

Well, don't pin me down on this, but PSD2 only applies if you're a service connecting to the bank details of a customer. As you're not doing this, it's probably just fine. I don't know any legal experts myself, but I can ask around.

reZach commented 5 years ago

That'd be great, thanks. Technically, one could say that My Budget is a "service" and then it's in legal trouble (which I don't want to have).

xvilo commented 5 years ago

I meant a hosted solution. Which it isn't.

reZach commented 5 years ago

I'd personally still like some professional legal advice around this matter.

reZach commented 5 years ago

@ross-ritchey @xvilo @Dan-inpooling

I've done some thinking around this matter, and am considering re-implementing with Plaid. They would allow us to sync transactions to banks in the UK & the US.

To your point, Dan, yes - our information would not stay offline.

I'd like your opinions, as I want to provide what people are asking for (syncing transactions), but also want to keep your data secure (simply not enough people are volunteering to write a "connector" for their bank).

Dan-inpooling commented 5 years ago

I am not sure I am the right person to answer the question, as I work on data privacy. If you use Plaid, then I don't see the difference between your tool and other existing online budget tools. A lot of them are free as well. For me, the most valuable part of your project is that it stays offline. I hope this won't discourage you. Another alternative I can think of: instead of connecting with banks directly, how about an offline tool that translates the monthly statement into the budget format and then adjusts automatically month by month?
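
One possible reading of the statement idea above, as a minimal sketch. It assumes the bank offers a CSV export with a `Date,Description,Amount` header, ISO dates, and no quoted commas; none of that comes from the thread, and `StatementEntry` and `parseStatementCsv` are hypothetical names.

```typescript
// Hypothetical shape for one imported statement line.
interface StatementEntry {
  month: string; // "YYYY-MM", used to file the entry under a monthly budget
  description: string;
  amount: number;
}

// Assumes a header row, ISO dates (YYYY-MM-DD) and no commas inside fields.
function parseStatementCsv(csv: string): StatementEntry[] {
  const entries: StatementEntry[] = [];
  const lines = csv.split(/\r?\n/).slice(1); // drop the header row
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    const [date = "", description = "", rawAmount = ""] = line.split(",");
    const amount = Number(rawAmount);
    if (rawAmount.trim() === "" || isNaN(amount)) continue; // skip malformed rows
    entries.push({
      month: date.trim().slice(0, 7),
      description: description.trim(),
      amount,
    });
  }
  return entries;
}
```

Everything here runs locally on a file the user downloaded themselves, so no credentials or transactions leave the machine.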

reZach commented 5 years ago

@Dan-inpooling I too am concerned about privacy and want it to be important in the application. I don't want to compromise on this; however, there is no help from the community to create more "connectors" for banks.

I can only hope to contribute to this effort, making it easier (perhaps writing a browser extension that can auto-generate the code so non-tech users can create connectors on their own). How do you feel about that?

I'm not too sure how downloading the monthly statement into a budget format would work, but I do appreciate the new direction you are taking with your suggestion!

Dan-inpooling commented 5 years ago

I like the idea of "Browser extension".

For a non-tech user, a browser extension is somewhat easier to handle. I don't know yet what you mean by "creating connectors" on their own, but I am interested in knowing how this will work. I know some fintech companies do data scraping because they get really bad API connections with banks. But if consumers do offline "data scraping" on their own accounts, this sounds totally legitimate to me.

On a different note - we actually created a free and open source browser extension for data privacy : https://github.com/Dan-inpooling/Privacy-eye. Feel free to give your feedback.

reZach commented 5 years ago

I wrote up how to make a connector here. A connector is code that connects (logs into your bank's website) and retrieves transaction data that is imported into your budget. It's synonymous with a "screen scraper", which is how other providers pull transaction details too.

Right now, there's only one connector, for one bank. There is a guide at the link I posted, but it's aimed at a technical audience. My idea about the extension is that, after enabling the browser extension, you would log into your bank and navigate to the page that has transaction data. After clicking on fields (transaction name, transaction date, transaction amount), the extension would generate screen scraping code to pull this transaction data for this particular bank.

Users would copy/paste this code in a PR/issue here, and I would add it to the app. Now, any user who uses the same bank would be able to pull in transactions automatically (w/o compromising privacy). It's bold, it will take some time, but it's doable.

I starred your repo, I really like it - I will have to use it/try it out. On an aside, have you heard of the pi-hole? It's something I think you would get behind.

Dan-inpooling commented 5 years ago

I like it! I think the browser extension makes a lot of sense for non-tech users. How much time did you spend creating the first connector? If someone else takes your code and tries to create one for another bank, how much time should they expect to spend? I have not looked into your code yet, but I can imagine most of it can be reused for another bank.

Yeah. Nice Pi-hole. Thanks for starring us. I did the same on yours as well:)

reZach commented 5 years ago

Thanks. You can reuse most of it, except for the regex that is going to be specific to the HTML of your bank's website. I'd expect it would take 35 minutes in total - not exactly super-fast.
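
To illustrate the bank-specific regex mentioned above, here is a minimal sketch. It is not the actual My Budget connector code: the table markup, the `Transaction` shape and the `parseTransactions` name are all assumptions, and it expects amounts that use `.` as the decimal separator.

```typescript
// Hypothetical common shape for what a connector returns.
interface Transaction {
  date: string; // kept exactly as shown on the bank's page
  name: string;
  amount: number; // negative for money leaving the account
}

// Assumed markup: <tr><td>DATE</td><td>NAME</td><td>AMOUNT</td></tr>
// The regex is the only bank-specific part; everything else could be reused.
function parseTransactions(html: string): Transaction[] {
  const rowPattern =
    /<tr>\s*<td>([^<]*)<\/td>\s*<td>([^<]*)<\/td>\s*<td>([^<]*)<\/td>\s*<\/tr>/g;
  const results: Transaction[] = [];
  let match: RegExpExecArray | null;
  while ((match = rowPattern.exec(html)) !== null) {
    const [, date = "", title = "", rawAmount = ""] = match;
    // Strip currency symbols and thousands separators, e.g. "$1,200.00".
    const cleaned = rawAmount.replace(/[^0-9.\-]/g, "");
    const amount = Number(cleaned);
    if (cleaned === "" || isNaN(amount)) continue; // skip rows without a usable amount
    results.push({ date: date.trim(), name: title.trim(), amount });
  }
  return results;
}
```

Supporting another bank would mostly mean swapping `rowPattern` for one that matches that bank's HTML.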

Always happy to support others :)

I created a repo if you'd like to work on the extension with me: https://github.com/reZach/my-budget-connector-creator.

reZach commented 5 years ago

@Dan-inpooling I've got a prototype working at the repo linked above. I'll make a short outline as an issue there so we can continue the conversation there.

Dan-inpooling commented 5 years ago

Thanks for setting this up. I will let you know when I have some down time to work on this. Sorry for the late response.

reZach commented 5 years ago

@Dan-inpooling That's quite alright. Let me know and I can talk through how I'm thinking it can work.

3flex commented 5 years ago

@reZach I caught your comment on HN related to an "open source Plaid" and was planning to reach out.

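I'm curious to see how things go with your extension, but I see some potential issues:

  • Pagination - need to recognise when more records are on the next page, and when the end of the list is reached.
  • Some banks generate awful HTML which might make creating and verifying selectors for scrapers difficult. Many banks use JSON APIs in browser which can be intercepted and make extracting data much simpler. An extension designed to scrape HTML won't be able to take advantage of the structured JSON.
  • Some banks need specific filters set in the UI before they can download a full transaction history. These would have to be included in the connector and the extension may not help unless it can record events on page like setting values of certain drop downs and applying filters. The alternative is to have users do that themselves when navigating to the page but then you can't run Puppeteer/Chrome in headless mode since you need user input and also site-specific instructions for the initial load.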

I hope that makes sense?

I think part of the reason that there's no open source Plaid is simply because no one developer can create it in isolation because they don't have access to the various checking and savings accounts, loan accounts and credit cards that need to be parsed. Instead of another bespoke solution that only works for My Budget I'd propose creating a repository of scrapers as a separate project, but one that My Budget could use.

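My thoughts are:

  • Build scrapers that can be executed either by Puppeteer (to support CLI and Electron apps etc) or by a browser extension
  • Get input from multiple developers so we can get reasonable coverage for multiple banks (and countries - I'm in Australia)
  • Ensure all scrapers are built to return data in a common format (using TypeScript would help)
  • Make sure it's easy to utilise in other projects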

There's a gap here for anyone who cares about privacy, personal finance and open source. That might not be a huge number of people but it's hopefully enough to get a project like this off the ground.

I'm working on a proof of concept at the moment with Australian banks, and once I've built something that works would love to get your feedback and hopefully also your input.

Appreciate any comments you have!

reZach commented 5 years ago

Thanks for your input @3flex. Here are my thoughts.

> Pagination - need to recognise when more records are on the next page, and when the end of the list is reached.

For this case, I'm envisioning the user to view a page where they can view all transactions, so we can simply ignore the pagination altogether.

> Some banks generate awful HTML which might make creating and verifying selectors for scrapers difficult. Many banks use JSON APIs in browser which can be intercepted and make extracting data much simpler. An extension designed to scrape HTML won't be able to take advantage of the structured JSON.

I didn't think of using this (smart idea!). However, I think it is both harder and less privacy-centric to watch all of the XHR requests a browser makes and decide which particular one holds the transaction data (if present at all). I was thinking instead that, for the extension, a user will click on the elements that hold particular pieces of data, and the code can "guess" how the HTML is structured and try to find the other transactions. It will take some logic, but I think it's doable.
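
A minimal sketch of one way the "guess" step could work. It assumes the user clicks the same field in two different transactions and that the extension records each click as a CSS path using `:nth-child` indexes; both the two-click approach and the `generalizeSelector` name are assumptions, not something the thread specifies.

```typescript
// Compare two recorded CSS paths segment by segment. Where they differ only
// by their :nth-child index, drop the index so the selector matches every row.
// Returns null when the paths differ in structure and cannot be generalized.
function generalizeSelector(pathA: string, pathB: string): string | null {
  const a = pathA.split(" > ");
  const b = pathB.split(" > ");
  if (a.length !== b.length) return null;

  const out: string[] = [];
  for (let i = 0; i < a.length; i++) {
    const segA = a[i] ?? "";
    const segB = b[i] ?? "";
    if (segA === segB) {
      out.push(segA); // identical segment, keep as recorded
      continue;
    }
    const strippedA = segA.replace(/:nth-child\(\d+\)$/, "");
    const strippedB = segB.replace(/:nth-child\(\d+\)$/, "");
    if (strippedA !== strippedB) return null; // different elements, give up
    out.push(strippedA);
  }
  return out.join(" > ");
}
```

For example, clicks recorded as `table#history > tbody > tr:nth-child(2) > td:nth-child(1)` and `table#history > tbody > tr:nth-child(5) > td:nth-child(1)` generalize to `table#history > tbody > tr > td:nth-child(1)`, which selects the first cell of every row.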

> Some banks need specific filters set in the UI before they can download a full transaction history. These would have to be included in the connector and the extension may not help unless it can record events on page like setting values of certain drop downs and applying filters. The alternative is to have users do that themselves when navigating to the page but then you can't run Puppeteer/Chrome in headless mode since you need user input and also site-specific instructions for the initial load.

Yes, this will need to be handled somehow.

And to your other points.

> My thoughts are:
>
>   • Build scrapers that can be executed either by Puppeteer (to support CLI and Electron apps etc) or by a browser extension
>   • Get input from multiple developers so we can get reasonable coverage for multiple banks (and countries - I'm in Australia)
>   • Ensure all scrapers are built to return data in a common format (using TypeScript would help)
>   • Make sure it's easy to utilise in other projects

Reading this makes me think of a scraper that can be built once and translated into different formats (Puppeteer code, library 2's code, etc.). The scraper can capture things like "select this data", "store this data", "modify this data" - and then when it generates Puppeteer code, it can translate these actions into code Puppeteer expects. Just a thought.
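
The translation idea above can be sketched roughly as follows. The `ScraperAction` type and `toPuppeteerCode` function are hypothetical; only the emitted calls (`page.goto`, `page.click`, `page.select`, `page.$$eval`) are real Puppeteer methods, and the generated text is returned as a string rather than executed.

```typescript
// Hypothetical, tool-neutral description of what the user recorded.
type ScraperAction =
  | { kind: "goto"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "select"; selector: string; value: string }
  | { kind: "extract"; selector: string; storeAs: string };

// Translate the neutral actions into Puppeteer source code, one line each.
// A second function could target another library from the same action list.
function toPuppeteerCode(actions: ScraperAction[]): string {
  const lines = actions.map((action) => {
    switch (action.kind) {
      case "goto":
        return `await page.goto(${JSON.stringify(action.url)});`;
      case "click":
        return `await page.click(${JSON.stringify(action.selector)});`;
      case "select":
        return `await page.select(${JSON.stringify(action.selector)}, ${JSON.stringify(action.value)});`;
      case "extract":
        // storeAs is assumed to be a valid JavaScript identifier.
        return `const ${action.storeAs} = await page.$$eval(${JSON.stringify(action.selector)}, (els) => els.map((el) => el.textContent));`;
    }
  });
  return lines.join("\n");
}
```

The `select` action is one way to cover the point about filters and drop-downs that have to be set before a full history is shown.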

I agree on your other three points.

Do you have a public repo where you are working on this right now? I'd love to see it, and would happily help with it if you've already started!

3flex commented 5 years ago

Ideas are cheap and mine is mostly just that at the moment, but I'm pleased to hear you think it has some merit. Your idea about code generation is a good one. I'd come across https://github.com/MontFerret/ferret and was wondering if that might be a good basis for something like that but it would be a big project to port or make something similar for other languages. There's a WASM build but I haven't looked into it.

I don't know if you can avoid pagination that easily, none of the banks I use have any view that would show all transactions I'm interested in on a single view or page. The most I've seen is 200 per page but that's not enough, majority don't even show that many.

I don't know if the privacy risk of intercepting XHR is actually that high, since the point of using a tool like this is to ingest all bank transactions anyway. The couple of scrapers I've built that do this will only intercept relevant URLs so that helps too... It's actually harder to check every call then see if it contains transactions than only checking data returned from specific endpoints.
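
A minimal sketch of the "only intercept relevant URLs" approach described above. The endpoint pattern, `CapturedResponse` and `extractTransactionPayloads` are hypothetical; in Puppeteer the captured list could be filled from `page.on("response", ...)` using `response.url()` and `response.text()`, but the function itself does not depend on Puppeteer.

```typescript
// One network response recorded while the bank's page loaded.
interface CapturedResponse {
  url: string;
  body: string; // raw response text
}

// Hypothetical endpoint pattern; each bank would need its own.
const TRANSACTION_ENDPOINT = /\/api\/accounts\/[^/]+\/transactions(\?|$)/;

// Keep only responses from the known transaction endpoint and parse them.
// Unrelated traffic is never inspected beyond its URL.
function extractTransactionPayloads(responses: CapturedResponse[]): unknown[] {
  const payloads: unknown[] = [];
  for (const response of responses) {
    if (!TRANSACTION_ENDPOINT.test(response.url)) continue;
    try {
      payloads.push(JSON.parse(response.body));
    } catch {
      // Body was not JSON after all; skip it rather than fail the whole import.
    }
  }
  return payloads;
}
```

Checking the URL first is what keeps this cheaper than parsing every call to see whether it happens to contain transactions.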

Anyway I have a lot more work to do before I'm ready to share anything but I'll comment here when (or if) I have something. Thanks!

reZach commented 5 years ago

@3flex great, thanks for this information!

> I'd come across https://github.com/MontFerret/ferret and was wondering if that might be a good basis for something like that but it would be a big project to port or make something similar for other languages. There's a WASM build but I haven't looked into it.

I agree, a port would be a lot of work on this.

> I don't know if you can avoid pagination that easily, none of the banks I use have any view that would show all transactions I'm interested in on a single view or page. The most I've seen is 200 per page but that's not enough, majority don't even show that many.

Mine has a view where I can see the last 2 years of transactions. Hm - I suppose each bank would be different on this. In either case, this tool needs to account for both cases.
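
Handling both cases (a single all-transactions view and a paginated list) could look roughly like this. `ResultPage`, `collectAllPages` and the `hasNext` flag are assumptions for illustration; a real connector would fetch pages asynchronously, and the callback is synchronous here only to keep the sketch short.

```typescript
// One page of results plus a flag saying whether another page follows.
interface ResultPage<T> {
  items: T[];
  hasNext: boolean;
}

// Keep requesting pages until the source reports no further page,
// a page comes back empty, or the safety cap is reached.
function collectAllPages<T>(
  fetchPage: (pageNumber: number) => ResultPage<T>,
  maxPages = 100
): T[] {
  const all: T[] = [];
  for (let pageNumber = 1; pageNumber <= maxPages; pageNumber++) {
    const page = fetchPage(pageNumber);
    all.push(...page.items);
    if (!page.hasNext || page.items.length === 0) break;
  }
  return all;
}
```

A bank that shows everything on one view simply returns `hasNext: false` on the first call, so the same loop covers it.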

> I don't know if the privacy risk of intercepting XHR is actually that high, since the point of using a tool like this is to ingest all bank transactions anyway. The couple of scrapers I've built that do this will only intercept relevant URLs so that helps too... It's actually harder to check every call then see if it contains transactions than only checking data returned from specific endpoints.

I agree, although it requires some technical knowledge on the part of the user to find these endpoints.

Sounds good, thanks!

reZach commented 5 years ago

@Dan-inpooling @3flex

I've got an extension partially creating connector code for My Budget: https://github.com/reZach/page-recorder. It still has some kinks to work through, but I'm making progress and wanted to share it with you both!