ReliaSolve / Molprobity2


Error computing suiteness on 4qln for any output type #156

Closed russell-taylor closed 3 years ago

russell-taylor commented 3 years ago

There are no suites in the file. For the string output type, the original program writes an output file containing a single character (0x0a, a newline). For the report output type, it writes the following:

 all general case widths, power = 3.00
Found 0 complete suites derived from 0 entries
0 suites were triaged, leaving 0 assigned to bins
For all 0 suites: average suiteness== 0.000 (power==3.00)
     0 suites are  outliers
     0 suites have suiteness == 0    
     0 suites have suiteness >  0 <.1
     0 suites have suiteness >=.1 <.2
     0 suites have suiteness >=.2 <.3
     0 suites have suiteness >=.3 <.4
     0 suites have suiteness >=.4 <.5
     0 suites have suiteness >=.5 <.6
     0 suites have suiteness >=.6 <.7
     0 suites have suiteness >=.7 <.8
     0 suites have suiteness >=.8 <.9
     0 suites have suiteness >=.9    

The dangle input file is completely empty (no characters).

Joymaker commented 3 years ago

Here's a new one! How did YOU produce a dangle file?

mmtbx.mp_geo rna_backbone=True 4qln.pdb 1>4qln.dangle

Sorry: Conflicting angle restraints:

  1. definition from: residue: G%rna3p_pur, conformer "A"
  2. definition from: residue: G%5END, conformer "C"
  atoms:
    "ATOM 487 C6 G A 21 .. C "
    "ATOM 481 N1 G A 21 .. N "
    "ATOM 482 C2 G A 21 .. C "

russell-taylor commented 3 years ago

Mine presumably crashes too, which explains why the input dangle file is completely empty. The old code handles that empty input file and emits the report above. The new code reports "no residues" on standard error and produces no output.

I guess it makes sense to produce no output for no input, but it looks like the code is still returning an error code in this code path. I think the PDB and CIF inputs now complain on standard error but return with code 0. I think that would be okay to do in this case as well, even though it does not strictly match the old output. (I still worry that it will kill a script somewhere.)

russell-taylor commented 3 years ago

On second thought, we want the empty input file to produce the same output file as before, including the report table with all of its values set to 0, so that we don't break a general workflow that always parses those outputs.

Given that, the CIF and PDB file readers should behave the same way on inputs that have no RNA/DNA, emitting the same table with all values set to 0.
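
To make the "same table with zeros" requirement concrete, here is a minimal sketch of a report builder that stays well-defined on empty input. The function name `suiteness_report` is hypothetical and this is not the Molprobity2 code; it only approximates a few lines of the report quoted at the top of this issue (the header, triage, and outlier lines are omitted). The one point it illustrates is guarding the average so that zero suites yields 0.000 instead of an error.

```python
def suiteness_report(suitenesses, power=3.0):
    """Build a simplified suiteness summary (hypothetical helper, not project code)."""
    n = len(suitenesses)
    # Guard the division: with no suites the average is reported as 0.000.
    average = sum(suitenesses) / n if n else 0.0
    lines = [
        "For all %d suites: average suiteness== %.3f (power==%.2f)"
        % (n, average, power),
        "%6d suites have suiteness == 0"
        % sum(1 for s in suitenesses if s == 0),
    ]
    # Ten bins: (0, .1), [.1, .2), ..., [.8, .9), and [.9, 1].
    for i in range(10):
        low, high = i / 10.0, (i + 1) / 10.0
        if i == 0:
            label = ">  0 <.1"
            count = sum(1 for s in suitenesses if 0 < s < high)
        elif i == 9:
            label = ">=.9"
            count = sum(1 for s in suitenesses if s >= low)
        else:
            label = ">=.%d <.%d" % (i, i + 1)
            count = sum(1 for s in suitenesses if low <= s < high)
        lines.append("%6d suites have suiteness %s" % (count, label))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # An empty input still produces every row of the table, all with count 0.
    print(suiteness_report([]), end="")
```

Because every row is always emitted, a downstream script that parses for these lines sees the same layout whether or not the structure contains any suites.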

Joymaker commented 3 years ago

4Q1N if viewed on the PDB website does NOT show an RNA backbone score at all. What is it doing in our sample?

My complaint about mp_geo was misguided; it was based on a typo. I was reading and typing (lowercase) 4QLN.

russell-taylor commented 3 years ago

It is there because it has an RNA Backbone measure in the PDB file. You can see that as the last row on the summary validation graph on https://www.rcsb.org/structure/4QLN

russell-taylor commented 3 years ago

However, the problem as stated is that the new program produces empty output on an empty .dangle file whereas the original produces a report that has values of 0 in it. That's the issue that should be addressed.

Separately, I need to see why mp_geo is failing on this test file (turns out to be a Sorry based on conflicting angle restraints).

Joymaker commented 3 years ago

Done. I now write a warning message to stderr (not stdout) and produce output that matches the former behavior.

At the moment a CIF or PDB input file that contains no RNA residues produces a warning message and empty output. Are we happy with that?
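
For reference, the dangle-input behavior described above (warning on stderr, zero-valued report on stdout, exit code 0) could look roughly like the sketch below. The names `run` and the warning text are assumptions for illustration, not the actual Molprobity2 code, and the report is reduced to a single summary line. The same shape could be reused for CIF or PDB inputs with no RNA residues.

```python
import io
import sys


def run(suitenesses, out=sys.stdout, err=sys.stderr):
    """Write a one-line summary to `out`; hypothetical entry point for illustration."""
    n = len(suitenesses)
    if n == 0:
        # Diagnostics go to stderr so that stdout stays parseable.
        err.write("Warning: no residues found in input\n")
    average = sum(suitenesses) / n if n else 0.0
    # The report is written even when there is nothing to summarize.
    out.write(
        "For all %d suites: average suiteness== %.3f (power==3.00)\n" % (n, average)
    )
    # Empty input is treated as a warning, not a failure.
    return 0


if __name__ == "__main__":
    # Capture both streams to show where each message ends up.
    out, err = io.StringIO(), io.StringIO()
    code = run([], out, err)
    sys.stdout.write(out.getvalue())
    sys.stderr.write(err.getvalue())
    sys.exit(code)
```

Returning 0 here follows the concern raised earlier in the thread that a nonzero exit code on empty input might kill a calling script.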

russell-taylor commented 3 years ago

The behavior of the system when reading PDB and CIF files is new, so it does not necessarily have to match the existing behavior. I would prefer that it produce the same kind of report it would have produced when run on dangle output, but that is not worth an hour of time to change. It would be worth ten minutes.