bhftbootcamp / CcyConv.jl

CcyConv is a Julia package for performing currency conversions. It allows for direct and multi-step conversions using the latest exchange πŸ’± rates
Apache License 2.0
7 stars 2 forks source link

Add a new pathfinding algorithm #1

Open gryumov opened 3 months ago

gryumov commented 3 months ago

At the moment, the package implements only one algorithm for finding a path - A*. This algorithm searches for the shortest path between two currencies (minimum number of conversions). But what if we want to find the most profitable currency exchange chain?

Description

Add a new algorithm to the package that is capable of finding the most advantageous chain of currency conversions between two specified currencies.

The "most advantageous chain of currency conversions" is defined as a sequence of exchanges from currency A to currency B such that the product of the exchange rates (the weights of the edges in the currency exchange graph) is optimized. If the goal is to sell currency A, the algorithm should maximize this product; if the goal is to buy currency A, the algorithm should minimize it.

Tasks

Tests

The algorithm must pass the following tests:

graph LR;
    A --> |50| F
    A --> |5| B --> |4| D --> |2| F;
    B --> |1| C --> |2| D;
    C --> |2| B;
    A --> |4| C --> |2| E --> |1| F;
    X --> |42| Y;
    Y --> |24| X;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A,F julia_blue;
    class B,E julia_red;
    class C,X julia_green;
    class D,Y julia_purple;

Testset

using Test
using CcyConv

my_graph = FXGraph()

append!(
    my_graph,
    [
        Price("A", "F", 50.0),
        Price("A", "B", 5.0),
        Price("A", "C", 4.0),
        Price("B", "D", 4.0),
        Price("B", "C", 1.0),
        Price("C", "B", 2.0),
        Price("C", "D", 2.0),
        Price("C", "E", 2.0),
        Price("D", "F", 2.0),
        Price("E", "F", 1.0),
        Price("X", "Y", 42.0),
        Price("Y", "X", 24.0),
    ],
)

# Your maximum algorithm
function your_max_alg(
    fx::FXGraph,
    from_id::UInt64,
    to_id::UInt64,
)::Vector{Pair{UInt64,UInt64}}
    # your alg code
end

# Your minimum algorithm
function your_min_alg(
    fx::FXGraph,
    from_id::UInt64,
    to_id::UInt64,
)::Vector{Pair{UInt64,UInt64}}
    # your alg code
end

conv_max(fx, from, to) = fx(CcyConv.MyCtx(), your_max_alg, from, to)
conv_min(fx, from, to) = fx(CcyConv.MyCtx(), your_min_alg, from, to)

@testset "Pathfinding algorithm" begin
    @testset "Test β„–1: A -> F" begin
        a_star_rate = conv_a_star(my_graph, "A", "F")
        max_rate = conv_max(my_graph, "A", "F")
        min_rate = conv_min(my_graph, "A", "F")

        @test max_rate |> conv_value β‰ˆ 64.0
        @test conv_value(a_star_rate) < conv_value(max_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(max_rate))
        @test max_rate |> conv_chain == [
            Price("A", "C", 4.0),
            Price("C", "B", 2.0),
            Price("B", "D", 4.0),
            Price("D", "F", 2.0),
        ]

        @test min_rate |> conv_value β‰ˆ 8.0
        @test conv_value(min_rate) < conv_value(a_star_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(min_rate))
        @test min_rate |> conv_chain == [
            Price("A", "C", 4.0),
            Price("C", "E", 2.0),
            Price("E", "F", 1.0),
        ]
    end

    @testset "Test β„–2: F -> A" begin
        a_star_rate = conv_a_star(my_graph, "F", "A")
        max_rate = conv_max(my_graph, "F", "A")
        min_rate = conv_min(my_graph, "F", "A")

        @test max_rate |> conv_value β‰ˆ 0.2
        @test conv_value(a_star_rate) < conv_value(max_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(max_rate))
        @test max_rate |> conv_chain == [
            Price("E", "F", 1.0),
            Price("C", "E", 2.0),
            Price("C", "B", 2.0),
            Price("A", "B", 5.0),
        ]

        @test min_rate |> conv_value β‰ˆ 0.02
        @test conv_value(min_rate) <= conv_value(a_star_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(min_rate))
        @test min_rate |> conv_chain == [
            Price("A", "F", 50.0),
        ]
    end

    @testset "Test β„–3: A -> B" begin
        a_star_rate = conv_a_star(my_graph, "A", "B")
        max_rate = conv_max(my_graph, "A", "B")
        min_rate = conv_min(my_graph, "A", "B")

        @test max_rate |> conv_value β‰ˆ 50.0
        @test conv_value(a_star_rate) < conv_value(max_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(max_rate))
        @test max_rate |> conv_chain == [
            Price("A", "F", 50.0),
            Price("E", "F", 1.0),
            Price("C", "E", 2.0),
            Price("C", "B", 2.0),
        ]

        @test min_rate |> conv_value β‰ˆ 1.0
        @test conv_value(min_rate) < conv_value(a_star_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(min_rate))
        @test min_rate |> conv_chain == [
            Price("A", "C", 4.0),
            Price("C", "E", 2.0),
            Price("E", "F", 1.0),
            Price("D", "F", 2.0),
            Price("B", "D", 4.0),
        ]
    end

    @testset "Test β„–4: F -> D" begin
        a_star_rate = conv_a_star(my_graph, "F", "D")
        max_rate = conv_max(my_graph, "F", "D")
        min_rate = conv_min(my_graph, "F", "D")

        @test max_rate |> conv_value β‰ˆ 4.0
        @test conv_value(a_star_rate) < conv_value(max_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(max_rate))
        @test max_rate |> conv_chain == [
            Price("E", "F", 1.0),
            Price("C", "E", 2.0),
            Price("C", "B", 2.0),
            Price("B", "D", 4.0),
        ]

        @test min_rate |> conv_value β‰ˆ 0.16
        @test conv_value(min_rate) < conv_value(a_star_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(min_rate))
        @test min_rate |> conv_chain == [
            Price("A", "F", 50.0),
            Price("A", "C", 4.0),
            Price("C", "D", 2.0),
        ]
    end

    @testset "Test β„–5: D -> B" begin
        a_star_rate = conv_a_star(my_graph, "D", "B")
        max_rate = conv_max(my_graph, "D", "B")
        min_rate = conv_min(my_graph, "D", "B")

        @test max_rate |> conv_value β‰ˆ 1.0
        @test conv_value(a_star_rate) < conv_value(max_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(max_rate))
        @test max_rate |> conv_chain == [
            Price("C", "D", 2.0),
            Price("C", "B", 2.0),
        ]

        @test min_rate |> conv_value β‰ˆ 0.1
        @test conv_value(min_rate) < conv_value(a_star_rate)
        @test length(conv_chain(a_star_rate)) <= length(conv_chain(min_rate))
        @test min_rate |> conv_chain == [
            Price("C", "D", 2.0),
            Price("C", "E", 2.0),
            Price("E", "F", 1.0),
            Price("A", "F", 50.0),
            Price("A", "B", 5.0),
        ]
    end

    @testset "Test β„–6: A -> Y" begin
        max_rate = conv_max(my_graph, "A", "Y")
        min_rate = conv_min(my_graph, "A", "Y")

        @test max_rate |> conv_value |> isnan
        @test max_rate |> conv_chain |> isempty

        @test min_rate |> conv_value |> isnan
        @test min_rate |> conv_chain |> isempty
    end
end

Basic rules of the algorithm

The final conversion coefficient between two currencies A and B is determined by multiplying the weights of the edges corresponding to the laid path.

graph LR;
    A --> |W1| B;
    B --> |W2| C;
    C --> |W3| D;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
using CcyConv

my_graph = FXGraph()

push!(my_graph, Price("A", "B", 1))
push!(my_graph, Price("B", "C", 2))
push!(my_graph, Price("C", "D", 3))

conv = your_alg(my_graph, "A", "D")

julia> conv_value(conv)
6.0

julia> conv_chain(conv)
3-element Vector{CcyConv.AbstractPrice}:
 Price("A", "B", 1.0)
 Price("B", "C", 2.0)
 Price("C", "D", 3.0)

If there is an edge from A to B with weight W in the graph, then it is assumed that an inverse edge from B to A with weight 1/W is implicitly specified unless otherwise stated (like B -> C and C -> B).

graph LR;
    A --> |1| B;
    B .-> |1| A;
    B --> |2| C;
    C --> |3| B;
    C --> |3| D;
    D .-> |1/3| C;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
using CcyConv

my_graph = FXGraph()

push!(my_graph, Price("A", "B", 1))
push!(my_graph, Price("B", "C", 2))
push!(my_graph, Price("C", "B", 3))
push!(my_graph, Price("C", "D", 3))

conv = your_alg(my_graph, "D", "A")

julia> conv_value(conv)
1.0

julia> conv_chain(conv)
3-element Vector{CcyConv.AbstractPrice}:
 Price("C", "D", 3.0)
 Price("C", "B", 3.0)
 Price("A", "B", 1.0)

Thus, the algorithm must search for a path in all directions of the graph, even if the edges are explicitly specified only in one direction.

graph LR;
    A --> |2| B;
    B .-> |0.5| A;
    B --> |5| C;
    C .-> |0.2| B;
    B --> |10| D;
    D .-> |0.1| B;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
using CcyConv

my_graph = FXGraph()

push!(my_graph, Price("A", "B", 2))
push!(my_graph, Price("B", "C", 5))
push!(my_graph, Price("B", "D", 10))

# Test 1
conv = your_alg(my_graph, "A", "C")

julia> conv_value(conv)
10.0

julia> conv_chain(conv)
2-element Vector{CcyConv.AbstractPrice}:
 Price("A", "B", 2.0)
 Price("B", "C", 5.0)

# Test 2
conv = your_alg(my_graph, "D", "A")

julia> conv_value(conv)
0.05

julia> conv_chain(conv)
2-element Vector{CcyConv.AbstractPrice}:
 Price("B", "D", 10.0)
 Price("A", "B", 2.0)

Algorithm requirements

The algorithm must be able to find a path in which the final conversion coefficient will be maximum (We want to sell the base currency and get as much of the quoted currency as possible). Note that such a path may be longer than the path that would be found using the A* algorithm.

graph LR;
    A --> |2| B;
    B .-> |0.5| A;
    B --> |5| C;
    C .-> |0.2| B;
    A --> |5| C;
    C .-> |0.2| A;
    C --> |4| D;
    D .-> |0.25| C;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
using CcyConv

my_graph = FXGraph()

push!(my_graph, Price("A", "B", 2))
push!(my_graph, Price("B", "C", 5))
push!(my_graph, Price("A", "C", 5))
push!(my_graph, Price("C", "D", 4))

# A* algorithm
conv = conv_a_star(my_graph, "A", "D")

julia> conv_value(conv)
20.0

julia> conv_chain(conv)
2-element Vector{CcyConv.AbstractPrice}:
 Price("A", "C", 5.0)
 Price("C", "D", 4.0)

# Your maximum algorithm
conv = your_max_alg(my_graph, "A", "D")

julia> conv_value(conv)
40.0

julia> conv_chain(conv)
3-element Vector{CcyConv.AbstractPrice}:
 Price("A", "B", 2.0)
 Price("B", "C", 5.0)
 Price("C", "D", 4.0)

The algorithm must also be able to find a path in which the final conversion rate will be minimal (We want to buy as much of the base currency as possible and spend the minimum amount of the quoted currency). Note that such a path may be longer than the path that would be found using the A* algorithm.

graph LR;
    A --> |2| B;
    B .-> |0.5| A;
    B --> |2| C;
    C .-> |0.5| B;
    A --> |5| C;
    C .-> |0.2| A;
    C --> |4| D;
    D .-> |0.25| C;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
using CcyConv

my_graph = FXGraph()

push!(my_graph, Price("A", "B", 2))
push!(my_graph, Price("B", "C", 2))
push!(my_graph, Price("A", "C", 5))
push!(my_graph, Price("C", "D", 4))

# A* algorithm
conv = conv_a_star(my_graph, "A", "D")

julia> conv_value(conv)
20.0

julia> conv_chain(conv)
2-element Vector{CcyConv.AbstractPrice}:
 Price("A", "C", 5.0)
 Price("C", "D", 4.0)

# Your minimum algorithm
conv = your_min_alg(my_graph, "A", "D")

julia> conv_value(conv)
16.0

julia> conv_chain(conv)
3-element Vector{CcyConv.AbstractPrice}:
 Price("A", "B", 2.0)
 Price("B", "C", 2.0)
 Price("C", "D", 4.0)

For simplicity, you can break this algorithm into two separate ones to find the maximum and minimum. Or you can implement algorithm selection using keywords

Possible difficulties

Situations may arise with so-called β€œnegative cycles” - a closed path, following which we can endlessly improve the final conversion coefficient. You need to figure out how to avoid this behavior. Here A -> B -> C -> A -> ... is a negative cycle.

graph LR;
    A --> |2| B;
    B .-> |0.5| A;
    B --> |5| C;
    C .-> |0.2| B;
    A --> |5| C;
    C .-> |0.2| A;
    C --> |4| D;
    D .-> |0.25| C;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A julia_blue;
    class B julia_red;
    class C julia_green;
    class D julia_purple;
7uperior commented 2 months ago

Q1. Have you seen/heard of an implementation of something like this? I mean, with all restrictions of task :smiling_face_with_tear:

Tough task. Especially compared to what you have now, it's a very big leap forward I have some ideas for this thing

gryumov commented 2 months ago

@7uperior hi, yes, we have a solution πŸ˜‰

RongkunWang commented 1 month ago
graph LR;
    X --> |42| Y;
    Y --> |24| X;

    classDef julia_blue fill:#4063D8,stroke:#333,stroke-width:2px;
    classDef julia_green fill:#389826,stroke:#333,stroke-width:2px;
    classDef julia_red fill:#CB3C33,stroke:#333,stroke-width:2px;
    classDef julia_purple fill:#9558B2,stroke:#333,stroke-width:2px;
    classDef def_color fill:#eee,stroke:#ccc,stroke-width:2px;

    class A,F julia_blue;
    class B,E julia_red;
    class C,X julia_green;
    class D,Y julia_purple;

Just out of curiosity, if X-->Y has weight w but Y-->X doesn't have 1/w, but some weight w2 > 1/w, shouldn't the best rate to buy X from Y be w2 itself, rather than using a rate of 1/w, which reflects the consideration of direction of path X-->Y in the question? Or is it implicitly implied that w2 < 1/w, given all the gas/extra fees?

In this case wouldn't it be effective to only consider maximum weight. Since when you consider buying X from Y, instead of considering minimum weight X-->Y, you shall calculate maximum of Y-->X instead?

If the assumption of 2w<1/w is true, and we formulate the problem this way naturally we shall bypass the difficult of dead loop.

artememelin commented 3 weeks ago

Hi, @RongkunWang!

I didn't quite understand what you mean, so I would be very grateful for a more specific example However, as I understood it, the question is what to do if the path A --> B (with weight W1) has an explicitly defined reverse path B --> A (with weight W2). In this case, we assume that there is no implicit path B --> A with weight 1/W1 (only W2)

So we need a different mechanism to handle dead loops

RongkunWang commented 2 weeks ago

Hi @artememelin ,

I understood the original question formation. I'm just making a connection to the real-world problem and try to understand if the formation of the question makes sense.

If I can exchange crypto A to B with price W1, but I can change from B to A with price W2. If W1*W2 > 1, in real world we should simply just buy B with A and buy A back with B indefinitely until W1*W2 < 1, so as to make money. So in this case it actually encourages dead loops, not discourages it.. And if W1*W2< 1 is always true, we have the maximum gain we don't need to consider minimum weight path, we only need to consider maximum weight path. For example to from A to B, the most econimic price is by maximizing W_AB, or W_AC * W_CB, from B to A, we should maximize W_BA or W_BC * W_CA, but not minimize W_AB, or W_AC * W_CB?

artememelin commented 1 week ago

@RongkunWang, of course, what you suggest is not far from the truth.

Indeed, if we assume that in the real world we are unlikely to see a similar situation, then the task becomes quite simplified. Perhaps, for cases where we know for sure that there are no dead loops in the graph, we can implement the version of the algorithm you propose. In case of such a limitation, we can implement a more optimized algorithm for such graphs.

However, we also need an algorithm that will be able to work with dead loops in the graph