CBDD / rDock

rDock is a fast and versatile Open Source docking program that can be used to dock small molecules against proteins and nucleic acids. It is designed for High Throughput Virtual Screening (HTVS) campaigns and Binding Mode prediction studies.
https://rdock.github.io
GNU Lesser General Public License v3.0

Check possible bug for file reading #113

Open ggutierrez-sunbright opened 5 months ago

ggutierrez-sunbright commented 5 months ago

It is possible that the RbtBaseFileSource::Read method has a bug: when reading multi-record files that put the record delimiter at the beginning of each record (like MOL2), it may yield only the odd records (1st, 3rd, etc.).

I didn't have the time to check properly, but the original code looks suspicious:

while ((m_fileIn.getline(m_szBuf, MAXLINELENGTH)) && (strncmp(m_szBuf, cszRecDelim, n) != 0));
while ((m_fileIn.getline(m_szBuf, MAXLINELENGTH)) && (strncmp(m_szBuf, cszRecDelim, n) != 0)) {
    m_lineRecs.push_back(m_szBuf);
}

To me it looks like it's going to

  1. skip the delimiter of record 1
  2. read all lines and skip the delimiter for record 2
  3. skip until it finds the delimiter of record 3
  4. read all lines from record 3 and skip the delimiter for record 4

and so on
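The steps above can be reproduced in isolation with a minimal sketch. It assumes the two loops run once per record request on the same open stream; that calling pattern has not been checked against the rest of RbtBaseFileSource. The names `read_record_suspect` and `read_all_records` are hypothetical stand-ins, and `std::getline` with `std::string` replaces the fixed-size `m_szBuf` buffer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Record = std::vector<std::string>;

// Hypothetical stand-in for the two quoted loops; not the rDock code itself.
Record read_record_suspect(std::istream& in, const std::string& delim) {
    Record lines;
    std::string line;
    // True while a line was read and it does not start with the delimiter
    // (the same test as the strncmp(...) != 0 check in the quoted loops).
    auto next_non_delim = [&]() {
        return std::getline(in, line) &&
               line.compare(0, delim.size(), delim) != 0;
    };
    // Loop 1: discard lines up to and including the first delimiter found.
    while (next_non_delim()) {
    }
    // Loop 2: keep lines until the next delimiter, which is also consumed.
    while (next_non_delim()) {
        lines.push_back(line);
    }
    return lines;
}

// Assumed calling pattern: keep requesting records until the stream fails.
std::vector<Record> read_all_records(std::istream& in, const std::string& delim) {
    std::vector<Record> records;
    while (in) {
        Record rec = read_record_suspect(in, delim);
        if (!rec.empty()) {
            records.push_back(std::move(rec));
        }
    }
    return records;
}
```

Fed a three-record input where each record starts with `@<TRIPOS>MOLECULE`, this sketch returns only the bodies of records 1 and 3: the second loop consumes the delimiter that opens record 2, so the first loop of the next call skips all of record 2 while looking for a delimiter.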

It is possible that either I'm reading it wrong, or this issue doesn't affect the regular execution of rDock: it only reads one record for the receptor, which is the file that uses MOL2, while molecules are provided in SDF, which uses a delimiter at the end of each record.

There have been no complaints about this in the last decades, so I think the priority for this is quite low.