gtDMMB / RNAStructViz

Visualization, comparison, and analysis of RNA secondary structures via a cross-platform GUI
https://github.com/gtDMMB/RNAStructViz/wiki
GNU General Public License v3.0

"Raw sequence data" not correct #92

Closed ceheitsch closed 4 years ago

ceheitsch commented 4 years ago

Have noticed that the "Raw Sequence Data" is not necessarily correct. Am attaching a ct file. The ct file viewer lists this sequence:

GGCGGCGU AGCUCAGC UGGCUAGA GCAUGCGG UUCAUACC CGCAGUGU CCGGGGUU CGAGUCCC

It's missing the last 13 nucleotides. The full correct sequence is:

GGCGGCGUAGCUCAGCUGGCUAGAGCAUGCGGUUCAUACCCGCAGUGUCCGGGGUUCGAGUCCCUGCGCCGCCACCA

CP000141_af.nopct.txt

maxieds commented 4 years ago

@ceheitsch There is (was) a bug that prevented trailing bases that do not fill a complete block of 8 from being appended to the end of the buffer. In general, let's cap the buffer size at 4096 bases in total; past that point, the remaining characters will be shown as "...".
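For readers following along, the behavior described above (keep a short final block, cap the display at a fixed number of bases) can be sketched as below. This is an illustrative Python sketch, not the RNAStructViz code, which is C++ and is not shown in this thread. The function name `format_sequence_blocks` and its parameters are hypothetical.

```python
def format_sequence_blocks(seq, block_size=8, max_bases=4096):
    """Split seq into space-separated blocks, keeping a short final block.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from the RNAStructViz source.
    """
    truncated = len(seq) > max_bases
    shown = seq[:max_bases]
    # Stride through the sequence; the last slice may be shorter than
    # block_size and is still kept rather than dropped.
    blocks = [shown[i:i + block_size] for i in range(0, len(shown), block_size)]
    text = " ".join(blocks)
    if truncated:
        # Mark that bases beyond the cap were left out of the display.
        text += " ..."
    return text


if __name__ == "__main__":
    seq = ("GGCGGCGUAGCUCAGCUGGCUAGAGCAUGCGGUUCAUACC"
           "CGCAGUGUCCGGGGUUCGAGUCCCUGCGCCGCCACCA")
    print(format_sequence_blocks(seq))
```

Slicing with a stride avoids the "only full blocks" failure mode, because the final slice is emitted whatever its length.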

Here's a display of the new sequence data you posted above at 10 chars per block:

Screen Shot 2020-03-30 at 1 28 38 PM

ceheitsch commented 4 years ago

Buffer cap makes sense. Lower might be better; 4096 seems like a lot. How about 2048?

ceheitsch commented 4 years ago

The last 3 nucleotides are now appearing --- but they're not correct. See screenshot and original dot file.

Screen Shot 2020-05-18 at 11 32 18 AM

o.nivara_region_01.dot.txt

maxieds commented 4 years ago

@ceheitsch This issue should be fixed in the forthcoming release v2.3.7-testing:

Screen Shot 2020-05-19 at 11 38 57 AM

ceheitsch commented 4 years ago

Unfortunately, the image that you posted clearly shows that while the last 3 bases are now correct, the first 10 have gone missing! Before asking me to test & close an issue, please carefully check the correctness yourself. Also, since this has taken 2+ iterations to get fixed, please propose some further testing to verify the correctness. Thanks.

maxieds commented 4 years ago

@ceheitsch RE: proposing tests for this feature: Unfortunately, this is a much more complicated problem than we probably want to take on in response to my mistake with this corner case. The sequences printed in the CT viewer window are generated on the fly in response to GUI interactions. Exposing that functionality so it can be tested routinely, say with unit tests (which we have not so far gone to the trouble of adding to RNAStructViz), is far more work than I think is necessary to resolve the issue you noticed (a missed corner case in the previously proposed solution).

In general, one would use something like a so-called "mock" library that would allow us to embed a sequence of GUI actions into a static test, like a unit test. I do not believe such a library exists for FLTK. And, moreover, if it did, I would expect based on my experience with FLTK that it would be very, very (very) painful to work with.

My solution is as follows: I will make sure to re-test the corner cases with both original sequences before I write back and suggest that a fix for this issue has been added to the newest release.

maxieds commented 4 years ago

@ceheitsch Here is my proposed fix, tested on both of the corner cases implicit in the two troublesome examples above:

Example 1

> CP000141_af.nopct
GGCGGCGUAG CUCAGCUGGC UAGAGCAUGC 
GGUUCAUACC CGCAGUGUCC GGGGUUCGAG 
UCCCUGCGCC GCCACCA

Screen Shot 2020-05-28 at 6 46 19 AM

Example 2

> o.nivara_region_01.dot
GGGGAUAUAG CUCAGUUGGU AGAGCUCCGC 
UCUUGCAAGG CGGAUGUCAG CGGUUCGAGU 
CCGCUUAUCU CCA

Screen Shot 2020-05-28 at 6 46 30 AM
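The layout in the two examples above (10 bases per block, three blocks per line) lends itself to a simple round-trip check: stripping all whitespace from the displayed text should give back the input sequence exactly, for every length. The sketch below assumes the formatting logic can be run as a pure function outside the FLTK GUI, which this thread does not confirm. It is written in Python for brevity rather than in the project's C++, and all names are hypothetical.

```python
def format_sequence_lines(seq, block_size=10, blocks_per_line=3):
    """Lay out seq in space-separated blocks, a fixed number per line.

    Hypothetical stand-in for the viewer's formatting step.
    """
    blocks = [seq[i:i + block_size] for i in range(0, len(seq), block_size)]
    lines = [
        " ".join(blocks[j:j + blocks_per_line])
        for j in range(0, len(blocks), blocks_per_line)
    ]
    return "\n".join(lines)


def round_trip_ok(seq, **layout):
    """True if removing all whitespace from the display recovers seq."""
    formatted = format_sequence_lines(seq, **layout)
    return "".join(formatted.split()) == seq


def failing_lengths(max_len=100, **layout):
    """Return every length in 0..max_len whose round trip fails."""
    alphabet = "ACGU"
    failures = []
    for n in range(max_len + 1):
        # Deterministic, non-repeating-looking test sequence of length n.
        seq = "".join(alphabet[(i * 7 + n) % 4] for i in range(n))
        if not round_trip_ok(seq, **layout):
            failures.append(n)
    return failures


if __name__ == "__main__":
    print(failing_lengths())                 # default 10 x 3 layout
    print(failing_lengths(block_size=8))     # 8 bases per block
```

Sweeping every length up to a bound covers both failure modes reported in this issue, a dropped partial block at the end and missing bases at the start, without needing a hand-picked input for each.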

ceheitsch commented 4 years ago

Want to be sure that I'm understanding the situation, so let's discuss in meeting on Wed.

ceheitsch commented 4 years ago

Happily done with this.