movermeyer closed 6 years ago
It is likely because the puzzle is old and we used a different scoring scheme at that time. We'll retire it soon. Old puzzles were used to see how participants perform over a long period of time.
We are actually about to release a brand-new database system with new puzzles in the coming days.
I ran a bit of an experiment.
Using Fiddler, I intercepted the response from /api/getPuzzlesByC&D/ and modified the puzzle I was given. This was the response after I modified it for the first puzzle of the "Heart And Muscles" category.
{
"_id": "597badf1cadefb63b8e53a76",
"sequence_id": 33,
"submitter": "Akash",
"sequence": ["-------------------------","-------------------------","-------------------------","-------------------------","-------------------------","-------------------------","-------------------------","-------------------------","-------------------------"],
"tree": "((((hg19,rheMac2),(mm9,rn4)),(bosTau4,(equCab2,canFam2))),(loxAfr3,dasNov2))",
"disease_link": "Heart and Muscles",
"difficulty": 1,
"category": "Heart and Muscles",
"motif_seq": ["GCAGGTGTGA", "TGCAGGTGTG", "TTGCAGGTGT", "TGGGGGTGGGGG", "GGTGGGGG", "GGGTGGGG", "TTGCAGGTGTGA", "AGGTGTGA", "TTGGGGGTGGGG"],
"par_score": [0,0,0,0,0,0,0,0],
"annotations": ["E2A(bHLH)/proBcell-E2A-ChIP-Seq(GSE21978)", "E2A(bHLH),near_PU.1/Bcell-PU.1-ChIP-Seq(GSE21512)", "HEB(bHLH)/mES-Heb-ChIP-Seq(GSE53233)", "KLF14(Zf)/HEK293-KLF14.GFP-ChIP-Seq(GSE58341)", "Maz(Zf)/HepG2-Maz-ChIP-Seq(GSE31477)", "Maz(Zf)/HepG2-Maz-ChIP-Seq(GSE31477)", "Slug(Zf)/Mesoderm-Snai2-ChIP-Seq(GSE61475)", "Tbx5(T-box)/HL1-Tbx5.biotin-ChIP-Seq(GSE21529)", "Zfp281(Zf)/ES-Zfp281-ChIP-Seq(GSE81042)"],
"highest_score": 28,
"location_offset": 214,
"end_offset": 235,
"gene": "15",
"failure_rate": 1
}
Note that I have modified the tree to add more sequences (this puzzle normally has only 3), set the par_score to 0, and made completely empty sequences.
This gave me a game that looked like:
When I hit submit, the server happily accepted my modified problem solution. You can see it saved the solution in OpenPhylo:
To me this proves that the server does not validate the solutions submitted by users. A malicious user could use this to steal the top score for each puzzle by filling a puzzle with many sequences that are all "TTTTTTT...." (or whatever the optimal alignment is). Not only that, but these top-scoring alignments would presumably pollute the alignments that the researchers receive, lowering the data quality and causing headaches when they have to debug the data problem.
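To make the missing check concrete, here is a minimal sketch of the kind of server-side validation that would reject a tampered puzzle like the one above. It is not PHYLO's actual code. It assumes the server keeps a trusted copy of each puzzle, that the `tree` and `sequence` fields mean what they appear to mean in the response above, and that `-` marks a gap. The function name `validate_submission` and the toy data are hypothetical.

```python
GAP = "-"  # assumption: "-" is the gap character, as in the response above


def validate_submission(original, submitted):
    """Compare a submitted puzzle against the server's trusted copy.

    Returns a list of problems; an empty list means the submission passed.
    A player may only move gaps, so the tree, the number of sequences, and
    the bases of each sequence (with gaps removed) must be unchanged.
    """
    problems = []

    if submitted.get("tree") != original.get("tree"):
        problems.append("tree was modified")

    orig_seqs = original.get("sequence", [])
    sub_seqs = submitted.get("sequence", [])
    if len(sub_seqs) != len(orig_seqs):
        problems.append("number of sequences changed")
    else:
        for i, (orig, sub) in enumerate(zip(orig_seqs, sub_seqs)):
            # Stripping gaps must give back exactly the original bases.
            if orig.replace(GAP, "") != sub.replace(GAP, ""):
                problems.append(f"sequence {i}: bases changed")

    return problems


# Toy data for illustration only (not a real PHYLO puzzle).
trusted = {"tree": "((hg19,rheMac2),mm9)", "sequence": ["AC-GT", "ACG-T", "A-CGT"]}
honest = {"tree": "((hg19,rheMac2),mm9)", "sequence": ["ACGT-", "-ACGT", "ACGT-"]}
tampered = {"tree": "(((hg19,rheMac2),mm9),rn4)", "sequence": ["-----"] * 4}

print(validate_submission(trusted, honest))    # no problems: only gaps moved
print(validate_submission(trusted, tampered))  # tree and sequence count differ
```

A real check would presumably also recompute the score on the server from the validated alignment instead of trusting any score sent by the client.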
I tried to avoid causing damage in this experiment. I submitted a score of 0, which ideally would not be passed on as a valuable alignment.
I'm looking forward to seeing the new puzzles. Seems like it's an exciting time for PHYLO, with all these changes happening!
Good catch. I'll forward that to our dev. It should indeed be fixed. Thanks!
We have added solution submission validation on the server side. Thank you for notifying us of this!
For example, the first puzzle of "Other diseases" has a top score of 360, which I believe is not possible.
Even if every base was a match, there are only 166 (8 20 + 23) bases in the puzzle. Unless I'm mistaken in my scoring, the maximum possible score is 166 * 2 matches = 332.
I suspect that this is a result of the server not verifying solutions that users submit.
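The upper-bound argument above can be written as a quick plausibility check. This is only a sketch of my own estimate, not the game's real scoring code: it assumes every base can contribute at most 2 points (the "166 * 2" figure above), and the function names are hypothetical.

```python
MATCH_SCORE = 2  # assumption: at most 2 points per matched base, per the estimate above


def max_possible_score(total_bases, match_score=MATCH_SCORE):
    """Upper bound on the score if every base in the puzzle were a match."""
    return total_bases * match_score


def is_plausible(reported_score, total_bases):
    """A reported score above the upper bound cannot come from a valid alignment."""
    return reported_score <= max_possible_score(total_bases)


print(max_possible_score(166))  # 332
print(is_plausible(360, 166))   # False
```

If the real scoring scheme differs (for example, if older puzzles were scored differently), only `MATCH_SCORE` and the bound would need to change.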
The server should verify solutions to ensure that: