[SPIKE] Investigate splitting e2e tests into stubbed vs full e2e mode

Currently, our e2e tests use a single Testnet account to run through a few scenarios, such as sending a payment or adding a trustline. These scenarios generate transactions and actually submit them to Testnet.

The benefits of re-using the same account and not stubbing out these calls are:

We get signal that all our integrated APIs continue to work as we expect. This gives us a more complete picture of whether Freighter is actually working for the end user, and we don't miss anything because of overly optimistic stubbing
Tests run a bit faster because we don't need to be constantly creating and funding new accounts to test with
For Soroban token testing, we don't need to worry about minting to this newly generated account
Having to stub out calls makes creating tests slower and more complex. It's very fast to create an e2e test without stubbing. This is important as we really need a lot more coverage of our code.

The cons are:

The tests are non-deterministic. If a test fails in an unexpected way, the test account might end up in a weird state that causes subsequent test failures. This could become even more of a problem with multiple devs running tests at the same time
Issues with any third-party API, even if transient, can cause a test failure. For example, in times of high congestion on Testnet, a test could fail just for taking too long

Possible solution: We default to stubbing out these calls when the tests run in CI and, maybe before we cut a beta/prod release, we flip a toggle that lets these tests run against the actual APIs. These full API test runs would happen infrequently and would be run by only one dev at a time.
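As a rough illustration of what such a toggle could look like, here is a minimal sketch, not a proposal for the final design. The `E2E_MODE` variable, the `**/transactions` URL pattern, and the canned response body are all assumptions made up for this example. `PageLike` and `RouteLike` are local stand-ins that mirror the `page.route()` and `route.fulfill()` calls in Playwright; whether a real `Page` can be passed in directly would need to be confirmed during the spike.

```typescript
// Sketch only: E2E_MODE, the URL pattern, and the stub body are assumptions
// for illustration, not existing Freighter settings.
type E2EMode = "stubbed" | "full";

// Minimal local shapes of the Playwright objects used below, declared here so
// the sketch stands alone.
interface RouteLike {
  fulfill(response: {
    status: number;
    contentType: string;
    body: string;
  }): Promise<void>;
}

interface PageLike {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

// Default to stubbed mode; only an explicit E2E_MODE=full opts in to real APIs.
function resolveE2EMode(env: Record<string, string | undefined>): E2EMode {
  return env.E2E_MODE === "full" ? "full" : "stubbed";
}

// In stubbed mode, answer any request whose URL ends in /transactions with a
// canned response. Returns true if a stub was installed, false in full mode.
async function stubSubmitTransaction(
  page: PageLike,
  mode: E2EMode,
): Promise<boolean> {
  if (mode === "full") {
    return false;
  }
  await page.route("**/transactions", async (route) => {
    await route.fulfill({
      status: 200,
      contentType: "application/json",
      body: JSON.stringify({ successful: true, hash: "stubbed-hash" }),
    });
  });
  return true;
}
```

A test would call `stubSubmitTransaction(page, resolveE2EMode(process.env))` before driving the UI. Because the default is stubbed, a CI run that sets nothing stays deterministic, and a pre-release run would have to set `E2E_MODE=full` explicitly.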

Open Questions:

How easy/hard is it to create this kind of toggle in Playwright?
Can we run it from a GitHub Action?
How much work/effort will it be to stub out all the existing calls in our current e2e tests?
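One way to get a first estimate for the last question is to log every request the current tests make and count the distinct endpoints, since each one is a candidate stub. The sketch below assumes the requests have already been collected, for example with Playwright's `page.on("request", ...)` using `request.method()` and `request.url()`. The `ObservedRequest` shape and the function name are hypothetical.

```typescript
// Sketch only: ObservedRequest and summarizeEndpoints are hypothetical names.
interface ObservedRequest {
  method: string;
  url: string;
}

// Groups observed requests by "METHOD host/path", ignoring query strings, and
// counts how often each distinct endpoint was hit.
function summarizeEndpoints(requests: ObservedRequest[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const { method, url } of requests) {
    const { host, pathname } = new URL(url);
    const key = `${method.toUpperCase()} ${host}${pathname}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

The size of the returned map gives a rough upper bound on the number of stubs needed. Paths that embed an account ID or transaction hash show up as separate keys, so they would need to be collapsed by hand or with a pattern before the count is meaningful.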

stellar / freighter #1653