OpenWaterAnalytics / EPANET

The Water Distribution System Hydraulic and Water Quality Analysis Toolkit
MIT License
287 stars 206 forks source link

BUG: memory of network->Adjlist not handle correctly (memory leak or double free) #789

Open zannads opened 6 months ago

zannads commented 6 months ago

There is an error in the memory management of the variable network->Adjlist: the list is allocated but not freed. This bug appeared in my code, where I use EPANET only for the hydraulic simulations of the network in an optimization problem. As the possible decision variables in the optimization problem (Anytown) include tank and pipe installation, I use the ability of the library to modify the network structure dynamically to respond to the changes made by the optimization algorithm. The network nodes adjacency list is allocated during openH, but it is not freed in closeH. If we evaluate a new solution with a different number of nodes (e.g., we add a tank), the for loop in the function freeadjlists is using the wrong number of nodes, i.e., the new one (with the new tank) instead of the old one that was used to create the adjlist. The function freeadjlists and its counterparts buildadjlists and localadjlists use the variable Nnodes to manage the array of adjlists, which has size Nnodes+1. With the current version, these variables are only freed when closing a project (EN_close -> free data -> freeadjlists) or when a new hydraulic simulation is performed and the old one is deleted before starting a new simulation (EN_openH -> openhyd -> create sparse -> localadjlists -> freeadjlists). As freeadjlists iterates over the array of adjency lists using the current number of Nodes, if the number of nodes has changed since when the current adjlists have been created, it results in a memory leak (if the number of nodes is now less than before, then the last elements are never freed) of in a possible double free or some segmentation fault (if the number of nodes is now greater than before, then the for loop goes out of the boundaries of the netadjlists). This may also happen if the water quality simulations are run, as I don't see the function freeadjlists being called anywhere. See the example below.

The segmentation fault/double-free error is not easily replicable, as the compilers optimize the memory allocation and often when going out of boundary you end up with a zero value. However, the memory leak always happens if you keep removing nodes. I try to provide here an executable to show this error.

#include <iostream>
#include <string>

#include "epanet2_2.h"
#include "types.h"

void pe(int errcode, std::string msg = "") {
    if (errcode < 100) return;

    char err[EN_MAXMSG+1];
    EN_geterror(errcode, err, EN_MAXMSG);
    std::cerr << msg << std::endl << err << std::endl;
    throw std::runtime_error(err);
}

int main() {
    int errcode;
    EN_Project ph;
    pe(EN_createproject(&ph), "Error creating project");

    pe(EN_open(ph, "Net1.inp", "Net1.rpt", ""), "Error opening project");

    // Print number of nodes and links
    int nnodes, nlinks;
    pe(EN_getcount(ph, EN_NODECOUNT, &nnodes), "Error getting node count");
    pe(EN_getcount(ph, EN_LINKCOUNT, &nlinks), "Error getting link count");
    std::cout << "Number of nodes: " << nnodes << std::endl;
    std::cout << "Number of links: " << nlinks << std::endl;

    // Print netadjlist (should not exist)
    std::cout << "Address of the netadjlist before any simulation: " <<ph->network.Adjlist <<std::endl;

    // Solve (should work easily)
    pe(EN_solveH(ph), "Error solving hydraulics of the original network");
    // pe(EN_solveQ(ph), "Error solving qualitysim of the original network");
    std::cout << "Hydraulics solved with the original network." << std::endl;

    // Print netadjlit (this has been created in openH by buildadjlists)
    std::cout << "Address of the netadjlist after the simulation: " <<ph->network.Adjlist <<std::endl; 
    std::cout << "This netadjlits has " <<nnodes+1 <<" adjency lists" <<std::endl;

    // Add the new elements (tank and pipe, next to junction 12, with the same properties)
    std::cout << "Adding new elements" <<std::endl;
    int index;
    pe(EN_addnode(ph, "newNode", EN_TANK, &index), "Error adding new node");
    pe(EN_settankdata(ph, index, 710.0, 10, 10, 50, 25, 350, ""), "Error setting data of new tank");
    pe(EN_addlink(ph, "newLink", EN_PIPE, "22", "newNode", &index), "Error adding new link");
    pe(EN_setpipedata(ph, index, 10560, 12, 100, 0), "Error setting properties of new link");
    // Let's try also adding a duplicate pipe
    pe(EN_addlink(ph, "newDupLink", EN_PIPE, "12", "13", &index), "Error adding new link");
    pe(EN_setpipedata(ph, index, 5280, 12, 130, 0), "Error setting properties of new link");

    // Print number of nodes and links
    pe(EN_getcount(ph, EN_NODECOUNT, &nnodes), "Error getting node count");
    pe(EN_getcount(ph, EN_LINKCOUNT, &nlinks), "Error getting link count");
    std::cout << "Number of nodes: " << nnodes << std::endl;
    std::cout << "Number of links: " << nlinks << std::endl;

    // Print netadjlist (before it was not existing but for this simulation it already exists)
    std::cout << "Address of the netadjlist before the new simulation: " <<ph->network.Adjlist <<std::endl;
    std::cout << "This is the same as before adding the elements.\n";
    std::cout << "Now if the netadjlists is freed, it would free " <<nnodes+1 <<" adjency lits" <<std::endl;
    std::cout << "This is one more than it was before adding the tank.\nLuckily, thanks to the compiler, this usually points to null anyway and freeadjlists doesn't fail:" <<std::endl;
    std::cout << "Address of the netadjlist at Nnodes+1: " <<ph->network.Adjlist[nnodes+1] <<std::endl;

    // Solve (as written out to the console this doesn't fail because luckily the adjlists is initialiazed to zero anyway, 
    // however, if used for optimization, after many solutions evaluated this would not hold true).
    pe(EN_solveH(ph), "Error solving hydraulics of the extended network");
    // pe(EN_solveQ(ph), "Error solving qualitysim of the extended network");
    std::cout << "Hydraulics solved with the extended network." << std::endl;

    // Print netadjlit (this has been created in openH by buildadjlists)
    std::cout << "The new address of the netadjlist: " <<ph->network.Adjlist <<std::endl;
    std::cout << "Now containing " <<nnodes+1 <<"adjency lists.\n";

    // Now remove the new node and link
    std::cout << "Removing the added elements.\n";
    pe(EN_getnodeindex(ph, "newNode", &index), "Error getting index of new node");
    pe(EN_deletenode(ph, index, EN_UNCONDITIONAL), "Error deleting new node");
    // The link is deleted automatically thanks to the unconditional flag

    // remove the dup link 
    pe(EN_getlinkindex(ph, "newDupLink", &index), "Error getting index of new link");
    pe(EN_deletelink(ph, index, EN_UNCONDITIONAL), "Error deleting new link");

    // Print number of nodes and links
    pe(EN_getcount(ph, EN_NODECOUNT, &nnodes), "Error getting node count");
    pe(EN_getcount(ph, EN_LINKCOUNT, &nlinks), "Error getting link count");
    std::cout << "Number of nodes: " << nnodes << std::endl;
    std::cout << "Number of links: " << nlinks << std::endl;

    // Print netadjlists 
    std::cout << "Address of the netadjlist before the new simulation: " <<ph->network.Adjlist <<std::endl;
    std::cout << "This is the same as before removing the elements.\n";
    std::cout << "Now if the netadjlists is freed, it would free " <<nnodes+1 <<" adjency lists." <<std::endl;
    std::cout << "This is one less than it was before removing the elements.\nMeaning that the extra elements would not be freed and a memory leak happens." <<std::endl;
    std::cout << "Address of the netadjlist at Nnodes+1: " <<ph->network.Adjlist[nnodes+1] <<std::endl;

    // let's save it to prove that it was not freed
    Padjlist last_adj_list = ph->network.Adjlist[nnodes+1];

    // Solve (here memory leaks happens)
    pe(EN_solveH(ph), "Error solving hydraulics of the original network");
    // pe(EN_solveQ(ph), "Error solving hydraulics of the original network");
    std::cout << "Hydraulics solved again with the original network." << std::endl;
    std::cout << "Address of the netadjlist after the last simulation: " <<ph->network.Adjlist <<std::endl;

    // Close 
    pe(EN_close(ph));
    std::cout << "Closing the project.. and netadjlist has now being freed.\nIndeed, the address is "<<ph->network.Adjlist <<std::endl;
    std::cout << "However, we still can see the last element from the second simulation " <<last_adj_list <<" and access it, e.g. it was for node with index " <<last_adj_list->node <<std::endl;

    // I can indeed also free it 
    std::cout << "Meaning I can also free it without consequences, while it should result in a double free error..." <<std::endl;
    free(last_adj_list);
    // free(last_adj_list); if you try to do it again you have a double free error. 

    return 0;
}

The same results are obtained by uncommenting the lines that call the quality solver.

zannads commented 6 months ago

My suggestion and the solution I applied in my local copy is to free the localadjlists in closehyd. As the function openQ checks for the presence of an adjlists and creates one if none is present, the only drawback would be that of a double unnecessary free of adjlists (but I guess it is not too much of problem since it is already done within localadjlists in the creation of the sparse matrix). This means that the same call to freeadjlists should also be done in closequal.

Alternatively, a more robust approach would consist of invalidating (and freeing) the adjencylists every time a new node or link is added to the network. This however, requires additional checks for the existence of the variable net->Adjlists every time is used. This approach is also more robust to other possible misuse of the library, e.g., adding nodes between a call to runH and nextH.

I will create a Pull request to solve this issue, where we can further discuss which approach to take (maybe both?).

LRossman commented 6 months ago

Good catch and nice analysis. I agree with your fix. The only (small) downside I see is that when a user runs a water quality analysis after running hydraulics the adjacency lists have to be re-created once again (which won't happen if quality is run concurrently with hydraulics).