Swissquote Import for English Files, 'SwissquotePDFExtractorEn.java'?

sebnapi commented 2 years ago

Is your feature request related to a problem? Please describe.

The PDF import for Swissquote only supports German documents.

Describe the solution you'd like

I'd like to import Swissquote PDF documents that are in English.

Additional context

I don't know how you would approach this problem. The regular expressions seemed reasonable, but they are in German. To keep it simple, I created a second Swissquote extractor, SwissquotePDFExtractorEn.java, and added it to PDFImportAssistant.java.

package name.abuchen.portfolio.datatransfer.pdf;

import java.math.BigDecimal;
import java.math.RoundingMode;

import name.abuchen.portfolio.datatransfer.pdf.PDFParser.Block;
import name.abuchen.portfolio.datatransfer.pdf.PDFParser.DocumentType;
import name.abuchen.portfolio.datatransfer.pdf.PDFParser.Transaction;
import name.abuchen.portfolio.model.AccountTransaction;
import name.abuchen.portfolio.model.BuySellEntry;
import name.abuchen.portfolio.model.Client;
import name.abuchen.portfolio.model.PortfolioTransaction;
import name.abuchen.portfolio.model.Transaction.Unit;
import name.abuchen.portfolio.money.Money;
import name.abuchen.portfolio.money.Values;

@SuppressWarnings("nls")
public class SwissquotePDFExtractorEn extends AbstractPDFExtractor
{
    public SwissquotePDFExtractorEn(Client client)
    {
        super(client);

        addBankIdentifier("Swissquote"); //$NON-NLS-1$

        addBuySellTransaction();
        addDividendsTransaction();
        addAccountFeesTransaction();
    }

    @Override
    public String getLabel()
    {
        return "Swissquote Bank Ltd"; //$NON-NLS-1$
    }

    private void addBuySellTransaction()
    {
        DocumentType type = new DocumentType("Stock-Exchange Transaction: (Buy|Sell)");
        this.addDocumentTyp(type);

        Transaction<BuySellEntry> pdfTransaction = new Transaction<>();
        pdfTransaction.subject(() -> {
            BuySellEntry entry = new BuySellEntry();
            entry.setType(PortfolioTransaction.Type.BUY);
            return entry;
        });

        Block firstRelevantLine = new Block("^Stock-Exchange Transaction: (Buy|Sell) .*$");
        type.addBlock(firstRelevantLine);
        firstRelevantLine.set(pdfTransaction);

        pdfTransaction
                // If the type is "Sell", change from BUY to SELL
                .section("type").optional()
                .match("^Stock-Exchange Transaction: (?<type>Sell) .*$")
                .assign((t, v) -> {
                    if (v.get("type").equals("Sell"))
                    {
                        t.setType(PortfolioTransaction.Type.SELL);
                    }
                })

                // APPLE ORD ISIN: US0378331005 NASDAQ New York
                // 15 193 USD 2'895.00
                .section("name", "isin", "currency")
                .match("^(?<name>.*) ISIN: (?<isin>[\\w]{12}) .*$")
                .match("^[\\.'\\d]+ [\\.'\\d]+ (?<currency>[\\w]{3}) [\\.'\\d]+$")
                .assign((t, v) -> t.setSecurity(getOrCreateSecurity(v)))

                // 15 193 USD 2'895.00
                .section("shares")
                .match("^(?<shares>[\\.'\\d]+) [\\.'\\d]+ [\\w]{3} [\\.'\\d]+$")
                .assign((t, v) -> t.setShares(asShares(v.get("shares"))))

                // Betrag belastet auf Kontonummer  99999901, Valutadatum 07.08.2019
                // Betrag gutgeschrieben auf Ihrer Kontonummer  99999900, Valutadatum 07.02.2018
                // Amount debited from your account number  99999901, on value date of 18.08.2021
                .section("date")
                .match("^Amount (debited|credited) (from|from your) account number[\\s]+[\\d]+,[\\s]+on value date of (?<date>[\\d]{2}\\.[\\d]{2}\\.[\\d]{4})$")
                .assign((t, v) -> t.setDate(asDate(v.get("date"))))

                // Zu Ihren Lasten USD 2'900.60
                // Zu Ihren Gunsten CHF 8'198.70
                // To your debit EUR 2'049.80
                .section("currency", "amount")
                .match("^To your (debit|credit) (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    t.setAmount(asAmount(v.get("amount")));
                    t.setCurrencyCode(v.get("currency"));
                })

                // Total DKK 37'301.50
                // Wechselkurs 15.0198
                // Exchange rate 1.1889
                .section("fxCurrency", "fxAmount", "exchangeRate").optional()
                .match("^Total (?<fxCurrency>[\\w]{3}) ([\\s]+)?(?<fxAmount>[\\.'\\d]+)$")
                .match("^Exchange rate (?<exchangeRate>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    // read the forex currency, exchange rate and gross
                    // amount in forex currency
                    String forex = asCurrencyCode(v.get("fxCurrency"));
                    if (t.getPortfolioTransaction().getSecurity().getCurrencyCode().equals(forex))
                    {
                        BigDecimal exchangeRate = asExchangeRate(v.get("exchangeRate"));
                        BigDecimal reverseRate = BigDecimal.ONE.divide(exchangeRate, 10,
                                        RoundingMode.HALF_DOWN);

                        // gross given in forex currency
                        long fxAmount = asAmount(v.get("fxAmount"));
                        long amount = reverseRate.multiply(BigDecimal.valueOf(fxAmount))
                                        .setScale(0, RoundingMode.HALF_DOWN).longValue();

                        Unit grossValue = new Unit(Unit.Type.GROSS_VALUE,
                                        Money.of(t.getPortfolioTransaction().getCurrencyCode(), amount),
                                        Money.of(forex, fxAmount), reverseRate);

                        t.getPortfolioTransaction().addUnit(grossValue);
                    }
                })

                // Total DKK 35'410.5
                // Wechselkurs 14.9827
                // Exchange rate 1.1889
                // CHF 5'305.45
                .section("amount", "currency", "exchangeRate", "fxCurrency", "fxAmount").optional()
                .match("^Total (?<fxCurrency>[\\w]{3}) (?<fxAmount>[\\.'\\d]+)$")
                .match("^Exchange rate (?<exchangeRate>[\\.'\\d]+)$")
                .match("^(?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    Money forex = Money.of(asCurrencyCode(v.get("fxCurrency")), asAmount(v.get("fxAmount")));
                    Money gross = Money.of(asCurrencyCode(v.get("currency")), asAmount(v.get("amount")));

                    BigDecimal exchangeRate = asExchangeRate(v.get("exchangeRate"));
                    type.getCurrentContext().put("exchangeRate", exchangeRate.toPlainString());

                    if (forex.getCurrencyCode().equals(t.getPortfolioTransaction().getSecurity().getCurrencyCode()))
                    {
                        Unit unit;
                        // Swissquote sometimes uses scaled exchanges
                        // rates (such as DKK/CHF 15.42, instead of
                        // 0.1542,
                        // hence we try to extract and if we fail, we
                        // calculate the exchange rate
                        try
                        {
                            unit = new Unit(Unit.Type.GROSS_VALUE, gross, forex, exchangeRate);
                        }
                        catch (IllegalArgumentException e)
                        {
                            exchangeRate = BigDecimal.valueOf(((double) gross.getAmount()) / forex.getAmount());
                            type.getCurrentContext().put("exchangeRate", exchangeRate.toPlainString());

                            unit = new Unit(Unit.Type.GROSS_VALUE, gross, forex, exchangeRate);
                        }
                        t.getPortfolioTransaction().addUnit(unit);
                    }
                })

                .wrap(BuySellEntryItem::new);

        addTaxesSectionsTransaction(pdfTransaction, type);
        addFeesSectionsTransaction(pdfTransaction, type);
    }

    private void addDividendsTransaction()
    {
        DocumentType type = new DocumentType("(Dividend|Capital Gain)");
        this.addDocumentTyp(type);

        Block block = new Block("^(Dividend|Capital Gain) Our reference:(.*)$");
        type.addBlock(block);
        Transaction<AccountTransaction> pdfTransaction = new Transaction<AccountTransaction>()
                .subject(() -> {
                    AccountTransaction entry = new AccountTransaction();
                    entry.setType(AccountTransaction.Type.DIVIDENDS);
                    return entry;
                });

        pdfTransaction
                // HARVEST CAPITAL CREDIT ORD ISIN: US41753F1093NKN: 350
                // Dividende 0.08 USD
                .section("name", "isin", "currency")
                .match("^(?<name>.*) ISIN: (?<isin>[\\w]{12}).*$")
                .match("^(Dividend|Capital Gain) ([\\.'\\d]+) (?<currency>[\\w]{3})$")
                .assign((t, v) -> t.setSecurity(getOrCreateSecurity(v)))

                // Anzahl 350
                // Quantity 350
                .section("shares")
                .match("^Quantity (?<shares>[\\.'\\d]+)$")
                .assign((t, v) -> t.setShares(asShares(v.get("shares"))))

                // Ausführungsdatum 19.06.2019
                // Execution date
                .section("date")
                .match("^Execution date (?<date>[\\d]{2}\\.[\\d]{2}\\.[\\d]{4})")
                .assign((t, v) -> t.setDateTime(asDate(v.get("date"))))

                // Total USD 19.60
                .section("currency", "amount")
                .match("^Total (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)")
                .assign((t, v) -> {
                    t.setAmount(asAmount(v.get("amount")));
                    t.setCurrencyCode(v.get("currency"));
                })

                .wrap(TransactionItem::new);

        addTaxesSectionsTransaction(pdfTransaction, type);
        addFeesSectionsTransaction(pdfTransaction, type);

        block.set(pdfTransaction);
    }

    private void addAccountFeesTransaction()
    {
        DocumentType type = new DocumentType("Custody fees");
        this.addDocumentTyp(type);

        Block block = new Block("^Custody fees Our reference:(.*)$");
        type.addBlock(block);
        block.set(new Transaction<AccountTransaction>()

                .subject(() -> {
                    AccountTransaction transaction = new AccountTransaction();
                    transaction.setType(AccountTransaction.Type.FEES);
                    return transaction;
                })

                .section("date", "amount", "currency")
                .match("^Value date (?<date>[\\d]{2}\\.[\\d]{2}\\.[\\d]{4})$")
                .match("^Amount debited (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    t.setDateTime(asDate(v.get("date")));
                    t.setAmount(asAmount(v.get("amount")));
                    t.setCurrencyCode(asCurrencyCode(v.get("currency")));
                    t.setNote("Custody fees");
                })

                .wrap(TransactionItem::new));
    }

    private <T extends Transaction<?>> void addTaxesSectionsTransaction(T transaction, DocumentType type)
    {
        transaction
                // Abgabe (Eidg. Stempelsteuer) USD 4.75
                .section("currency", "tax").optional()
                .match("^Tax \\(Federal stamp duty\\) (?<currency>[\\w]{3}) (?<tax>[\\.'\\d]+)$")
                .assign((t, v) -> processTaxEntries(t, v, type))

                // Quellensteuer 15.00% (US) USD 4.20
                // Withholding tax
                .section("currency", "withHoldingTax").optional()
                .match("^Withholding tax [\\.'\\d]+% \\(.*\\) (?<currency>[\\w]{3}) (?<withHoldingTax>[\\.'\\d]+)$")
                .assign((t, v) -> processWithHoldingTaxEntries(t, v, "withHoldingTax", type))

                // Zusätzlicher Steuerrückbehalt 15% USD 4.20
                .section("currency", "tax").optional()
                .match("^Zus.tzlicher Steuerr.ckbehalt [\\.'\\d]+% (?<currency>[\\w]{3}) (?<tax>[\\.'\\d]+)$")
                .assign((t, v) -> processTaxEntries(t, v, type))

                // Verrechnungssteuer 35% (CH) CHF 63.88
                .section("currency", "tax").optional()
                .match("^Verrechnungssteuer [\\.'\\d]+% \\(.*\\) (?<currency>[\\w]{3}) (?<tax>[\\.'\\d]+)$")
                .assign((t, v) -> processTaxEntries(t, v, type));
    }

    private <T extends Transaction<?>> void addFeesSectionsTransaction(T transaction, DocumentType type)
    {
        transaction
                // Kommission Swissquote Bank AG USD 0.85
                // Commission Swissquote Bank Ltd USD 0.85
                .section("currency", "fee").optional()
                .match("^Commission Swissquote Bank Ltd (?<currency>[\\w]{3}) (?<fee>[\\.'\\d]+)$")
                .assign((t, v) -> processFeeEntries(t, v, type))

                // Börsengebühren CHF 1.00
                // Stock exchange fee 
                .section("currency", "fee").optional()
                .match("^Stock exchange fee (?<currency>[\\w]{3}) (?<fee>[\\.'\\d]+)$")
                .assign((t, v) -> processFeeEntries(t, v, type));
    }

    @Override
    protected long asAmount(String value)
    {
        return PDFExtractorUtils.convertToNumberLong(value, Values.Amount, "de", "CH");
    }

    @Override
    protected long asShares(String value)
    {
        return PDFExtractorUtils.convertToNumberLong(value, Values.Share, "de", "CH");
    }

    @Override
    protected BigDecimal asExchangeRate(String value)
    {
        return PDFExtractorUtils.convertToNumberBigDecimal(value, Values.Share, "de", "CH");
    }
}

I tried to run it, but after running mvn -f portfolio-app/pom.xml clean verify I can't find a working jar (I'm on a Mac). Two regexes are not translated because I didn't come across matching documents. Could you help me with that?

Nirus2000 commented 2 years ago

Why a new importer? The existing importer can very easily be extended to cover English. Just add the keywords to the existing regexes, then create test cases for verification. You can find the test cases here.

For example:

    private void addBuySellTransaction()
    {
        DocumentType type = new DocumentType("Stock-Exchange Transaction: (Buy|Sell)");
        this.addDocumentTyp(type);
        ...

Change to

    private void addBuySellTransaction()
    {
        DocumentType type = new DocumentType("(B.rsentransaktion|Stock\\-Exchange Transaction): (Kauf|Verkauf|Buy|Sell)");
        this.addDocumentTyp(type);
        ...

Or

                // Zu Ihren Lasten USD 2'900.60
                // Zu Ihren Gunsten CHF 8'198.70
                // To your debit EUR 2'049.80
                .section("currency", "amount")
                .match("^To your (debit|credit) (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    t.setAmount(asAmount(v.get("amount")));
                    t.setCurrencyCode(v.get("currency"));
                })

to

                // Zu Ihren Lasten USD 2'900.60
                // Zu Ihren Gunsten CHF 8'198.70
                // To your debit EUR 2'049.80
                .section("currency", "amount")
                .match("^(Zu Ihren|To your) (Lasten|Gunsten|debit|credit) (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$")
                .assign((t, v) -> {
                    t.setAmount(asAmount(v.get("amount")));
                    t.setCurrencyCode(asCurrencyCode(v.get("currency")));
                })

As you can see, only single lines are expanded. Please remember to escape all special characters like (\.[]{}()<>*+-=!?^$|).
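To try such a combined expression outside the importer, a minimal standalone sketch with plain java.util.regex could look like this. The class name BilingualAmountLine and the parse helper are made up for illustration and are not part of Portfolio Performance; only the pattern itself is taken from the snippet above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, only for trying out the combined German/English pattern.
class BilingualAmountLine
{
    // Same expression as in the snippet above: German and English keywords side by side.
    static final Pattern AMOUNT = Pattern.compile(
                    "^(Zu Ihren|To your) (Lasten|Gunsten|debit|credit) (?<currency>[\\w]{3}) (?<amount>[\\.'\\d]+)$");

    // Returns "<currency> <amount>" if the line matches, otherwise null.
    static String parse(String line)
    {
        Matcher m = AMOUNT.matcher(line);
        return m.matches() ? m.group("currency") + " " + m.group("amount") : null;
    }

    public static void main(String[] args)
    {
        System.out.println(parse("Zu Ihren Lasten USD 2'900.60"));
        System.out.println(parse("To your debit EUR 2'049.80"));

        // Pattern.quote escapes a whole literal keyword that contains special characters.
        String literal = "Tax (Federal stamp duty)";
        System.out.println(Pattern.matches(Pattern.quote(literal), literal));
    }
}
```

Pattern.quote is handy for checking a keyword by itself; inside a longer expression the special characters still have to be escaped by hand, as in "\\(Federal stamp duty\\)".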

Alternatively, you can create a PDF debug file and we will take care of it. The video tutorial shows how that works.

Video tutorial: Extract PDF documents for debugging

Alex :-)

sebnapi commented 2 years ago

This is certainly possible, but as a software engineer I wouldn't go that way: the regular expressions get harder to maintain with each language added. I would use class inheritance instead and keep the regular expressions in variables. Whether plain variables, static variables, or inheritable methods returning strings or regexes make the most sense depends on your experience with the other extractors.

I will use pseudocode to demonstrate the idea, because Java is too verbose and I don't have an IDE installed at the moment:

class SwissquotePDFExtractorEn(AbstractPDFExtractor):
    protected String regex_buy_sell_transaction = "^Stock-Exchange Transaction: (Buy|Sell) .*$"
    protected String regex_dividends_transaction = "^(Dividend|Capital Gain) Our reference:(.*)$"

   ... (implement methods using variables)

Then only override the variables and you are done:

class SwissquotePDFExtractorDe(SwissquotePDFExtractorEn):
    protected String regex_buy_sell_transaction = "^B.rsentransaktion: (Kauf|Verkauf) .*$"
    protected String regex_dividends_transaction = "^(Dividende|Kapitalgewinn) Unsere Referenz:(.*)$"
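In real Java the same idea would probably need overridable methods rather than fields, because a field redeclared in a subclass hides the parent's field instead of overriding it. A minimal sketch under that assumption follows; the class names SwissquotePatternsEn and SwissquotePatternsDe and their methods are hypothetical and do not use the actual AbstractPDFExtractor API.

```java
import java.util.regex.Pattern;

// Hypothetical base class: holds the English keywords and the shared matching logic.
class SwissquotePatternsEn
{
    protected String buySellBlock()
    {
        return "^Stock-Exchange Transaction: (Buy|Sell) .*$";
    }

    protected String dividendBlock()
    {
        return "^(Dividend|Capital Gain) Our reference:(.*)$";
    }

    // The shared logic calls the methods, so a subclass only swaps the patterns.
    boolean isBuySell(String line)
    {
        return Pattern.matches(buySellBlock(), line);
    }

    boolean isDividend(String line)
    {
        return Pattern.matches(dividendBlock(), line);
    }
}

// Hypothetical German variant: overrides the patterns, inherits everything else.
class SwissquotePatternsDe extends SwissquotePatternsEn
{
    @Override
    protected String buySellBlock()
    {
        return "^B.rsentransaktion: (Kauf|Verkauf) .*$";
    }

    @Override
    protected String dividendBlock()
    {
        return "^(Dividende|Kapitalgewinn) Unsere Referenz:(.*)$";
    }
}
```

Calling new SwissquotePatternsDe().isBuySell(line) runs the inherited method against the German pattern, which is the behavior the pseudocode is aiming for.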

Then I would use a factory to pick the right extractor or, which is easier in your current situation, build a composite PDF extractor that passes the document through two or more extractors (so that you don't need menu items for every broker-language combination).

class CompositePDFExtractor(AbstractPDFExtractor):

     CompositePDFExtractor(AbstractPDFExtractor ...extractors):
           ...
     extract() throws NotApplicable:
          for (AbstractPDFExtractor extractor : extractors):
               extractor.extract()
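As a rough Java illustration of the composite idea, the sketch below passes a text through several delegates and merges whatever they find. The TextExtractor interface and CompositeExtractor class are assumptions made for this example; the real AbstractPDFExtractor has a different API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal interface; the real AbstractPDFExtractor API differs.
interface TextExtractor
{
    // Returns the items found in the text, or an empty list if nothing matched.
    List<String> extract(String text);
}

// Hypothetical composite: one entry point that delegates to several extractors.
class CompositeExtractor implements TextExtractor
{
    private final List<TextExtractor> delegates;

    CompositeExtractor(TextExtractor... delegates)
    {
        this.delegates = List.of(delegates);
    }

    @Override
    public List<String> extract(String text)
    {
        // Ask every delegate in order and collect all results.
        List<String> result = new ArrayList<>();
        for (TextExtractor delegate : delegates)
            result.addAll(delegate.extract(text));
        return result;
    }
}
```

In this variant a delegate that does not recognize the document simply returns an empty list, which avoids the NotApplicable exception from the pseudocode; throwing and catching per delegate would be an equally valid design.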

sebnapi commented 2 years ago

I still need help running the program. What needs to be run after clean verify? After that step I only end up with the folders "name.abuchen.portfolio.[...]".

portfolio-performance / portfolio

Swissquote Import for English Files, 'SwissquotePDFExtractorEn.java'? #2823