Closed: AnnaDearman closed this issue 4 years ago
P.S. Here is the zip file I uploaded to AWS. As usual I tried to upload as few files as possible from the zip file Celia e-mailed to me (leaving out things that may only be required on her Mac, things that could have been deleted, etc.), so if you think I've deleted something important, let me know! I can always deploy it again with everything included. application_aws_v11.zip
Here are my bug-finding notes from an earlier deployed version, and Katie's notes about problems to solve, just so they're all in one place and I can close the older issues. I'm sure you've fixed many of them already. We can go through them and see whether they still need fixing or not:
I clicked on "Kinase" and entered "PRKACA". This failed, but when I went back in the browser and tried again, it worked. It returned "KAPCA_HUMAN" on the hits page, which is correct according to UniProt. I clicked on it and it failed, but worked after going back in the browser and trying again. All the "entry" info is correct according to UniProt. The amino acid sequence and MW look correct. It lists all phosphosites except two from the original source file from phosphosite.org (but they hadn't listed the gene so I would have omitted those rows anyway - I haven't checked the phospho.ELM original file). It says that SRC(Y419) is involved in 74 diseases but I can only see 9 rows in the diseases files! Are you searching for the number of occurrences of "Y419" in the whole file perhaps? I clicked on this phosphosite (it failed, but worked after going back in the browser and trying again) and it took me to the phosphosite ID page. The link to phosphosite.org works and it's the correct protein, neighbouring sequence, residue, MW. The UniProt IDs and entry names of the kinases are all correct. The diseases are all correct. I went back to the kinase page and looked at the inhibitors. They are all correct! I haven't checked the phosphosite or inhibitor searching or chromosome browsing properly yet (I believe the chromosome browsing doesn't work).
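A minimal sketch of the disease-count question above, assuming a hypothetical `PhosphositesDiseases` table with `GENE`, `RESIDUE` and `DISEASE` columns (these names and the rows are made up for illustration and are not taken from the real database). It contrasts counting every row whose residue text contains "Y419" with counting only the diseases for one specific gene and residue:

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PhosphositesDiseases (GENE TEXT, RESIDUE TEXT, DISEASE TEXT)"
)
conn.executemany(
    "INSERT INTO PhosphositesDiseases VALUES (?, ?, ?)",
    [
        ("SRC", "Y419", "disease A"),
        ("SRC", "Y419", "disease B"),
        ("SRC", "Y530", "disease C"),
        ("LCK", "Y419", "disease D"),   # same residue label, different protein
        ("ABC", "Y4190", "disease E"),  # only matches as a substring
    ],
)


def count_by_substring(conn, residue):
    """Over-counts: matches the residue text anywhere, for any gene."""
    row = conn.execute(
        "SELECT COUNT(*) FROM PhosphositesDiseases WHERE RESIDUE LIKE ?",
        (f"%{residue}%",),
    ).fetchone()
    return row[0]


def count_for_site(conn, gene, residue):
    """Counts distinct diseases for one exact gene + residue pair."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT DISEASE) FROM PhosphositesDiseases "
        "WHERE GENE = ? AND RESIDUE = ?",
        (gene, residue),
    ).fetchone()
    return row[0]
```

If the real query looks more like the first function, that could explain a count much larger than the number of rows visible for SRC(Y419), but this is only a guess until the query in application.py is checked.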
When I search for phosphosites by protein, and enter "H3F3A", I get four hits, but there should be 13:
T3, S10, T11, S28, S31, Y41, T45, S57, T58, T80, S86, T107, T118
They are all in the database (I checked using SQLiteStudio). I also searched for A2M and got no hits. I tried 53BP1 and got a subset of hits again.
I tried searching for phosphosites by "phosphorylated by" and it isn't working. I tried "PRKCD" and got no hits.
I found a bug that's my fault. When I add missing phosphosites to the long phosphosites table from the other tables, I'm losing some information along the way.
- Phosphosites do not seem to be working at all: the page is not loading
- When you search for a phosphosite by protein, it's not returning all of them
- The issue of having to go back in the browser
- Including code to create error logs
- Make the website work with the analysis version integrated
- Speed up the code for the analysis: better way to query kinases
- Putting the database separate
- Uploading .tsv instead of .csv
I got Internal Server Error after I clicked the submit button.
I'll look at the error logs properly tomorrow. At first glance it looks like when I did my test and it tried to do "return render_template('datanalysis.html'..." it said "df1 not defined", and when Yutang did his test it said "Length mismatch: Expected axis has 86 elements, new values have 7 elements". Will think about it tomorrow. Here's the error log if anyone's keen to look. I did try to give people access to the AWS console so they could use it too but have so far been unsuccessful, it's complicated. errors.txt
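Since error logs came up, here is a rough sketch of one way to capture a full traceback whenever the analysis step fails, using only the standard `logging` module. The logger name `jacky.errors` and the helpers `make_logger` and `run_analysis_safely` are hypothetical and are not part of application.py; on the server the stream would more likely be a file opened for appending, or a `logging.FileHandler`.

```python
import io
import logging


def make_logger(stream):
    """Build a logger that writes ERROR records (with tracebacks) to `stream`."""
    logger = logging.getLogger("jacky.errors")  # hypothetical logger name
    logger.setLevel(logging.ERROR)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def run_analysis_safely(analysis, logger):
    """Call `analysis()`; on failure, log the traceback and return a short message."""
    try:
        return analysis(), None
    except Exception as exc:
        # logger.exception records the message plus the current traceback.
        logger.exception("analysis failed")
        return None, f"{type(exc).__name__}: {exc}"
```

With something like this, an undefined `df1` would show up in the log as a `NameError` with the line that raised it, and the page could show a short message instead of an Internal Server Error.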
The phosphosite pages load on my computer (on the deployed version, not just locally). As for the rest:
The data analysis worked perfectly the first time I tried it with the Ipatasertib.tsv file. I tried it a second time with the same document and got an Internal Server Error, then closed the page and opened it again, and it couldn't even load the homepage.
I fixed the phosphosite hits issue for searches by protein. I changed all the phosphosite queries to use PHOS_ID instead of ID_PHOS, because PHOS_ID is in all tables involving phosphosites, while ID_PHOS is only in the phosphosites table, which made queries tricky. I also changed the search to query the Phosphosites table only (I think before I was joining it to the KinasesPhosphosites table, which is unnecessary since all the information for the phosphosite hits page can be found in the phosphosite table, and we're filtering according to the phosphosite table). So I have changed application.py and phoshits.html, which I have attached here so you can try running it with these. application.py.zip phoshits.html.zip
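To illustrate why dropping the join can change the number of hits, here is a small sketch using plain `sqlite3` rather than SQLAlchemy. Only the table names and `PHOS_ID` come from this thread; the `GENE`, `RESIDUE` and `KINASE` columns, the IDs and the kinase names are assumptions made up for the example.

```python
import sqlite3

# Toy data: three phosphosites, only one of which has kinase annotations.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Phosphosites (PHOS_ID TEXT PRIMARY KEY, GENE TEXT, RESIDUE TEXT);
    CREATE TABLE KinasesPhosphosites (PHOS_ID TEXT, KINASE TEXT);
    INSERT INTO Phosphosites VALUES
        ('p1', 'H3F3A', 'T3'),
        ('p2', 'H3F3A', 'S10'),
        ('p3', 'H3F3A', 'T11');
    INSERT INTO KinasesPhosphosites VALUES
        ('p2', 'KINASE_A'),
        ('p2', 'KINASE_B');
    """
)


def hits_with_join(conn, gene):
    """Inner join: sites without a kinase vanish, sites with several repeat."""
    rows = conn.execute(
        "SELECT p.RESIDUE FROM Phosphosites p "
        "JOIN KinasesPhosphosites kp ON kp.PHOS_ID = p.PHOS_ID "
        "WHERE p.GENE = ?",
        (gene,),
    ).fetchall()
    return [r[0] for r in rows]


def hits_single_table(conn, gene):
    """Query the Phosphosites table only: one row per phosphosite."""
    rows = conn.execute(
        "SELECT RESIDUE FROM Phosphosites WHERE GENE = ? ORDER BY PHOS_ID",
        (gene,),
    ).fetchall()
    return [r[0] for r in rows]
```

In this toy data the joined query returns S10 twice and misses T3 and T11, while the single-table query returns all three sites. Whether that is what caused the subset of hits for H3F3A in the deployed version is an assumption, not something confirmed here.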
I tried searching for phosphosites by "phosphorylated by" with GRK6 and it worked, but it didn't work with PRKCD. I will look into that.
We are only allowing tsv files now, since the data analysis script is written to work with tsv files. I got the same error as Yutang when I tried with a csv file only (on my local computer).
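A possible way to enforce the tsv-only rule before the analysis script runs, sketched with the standard library only. The names `ALLOWED_EXTENSIONS`, `allowed_file` and `looks_tab_separated` are hypothetical and not taken from application.py. The second check is there because a file can be named .tsv and still be comma-separated.

```python
import csv
import io

ALLOWED_EXTENSIONS = {"tsv"}  # assumption: only .tsv uploads are accepted


def allowed_file(filename):
    """True if the filename has an extension from ALLOWED_EXTENSIONS."""
    return (
        "." in filename
        and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
    )


def looks_tab_separated(text, min_columns=2):
    """True if the first line splits into at least `min_columns` tab-separated fields."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader, None)
    return header is not None and len(header) >= min_columns
```

Rejecting the upload with a clear message when either check fails should be friendlier than letting the analysis raise a length mismatch further down.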
Hi Celia,
Thanks for this. Changing the phosphosite ID has introduced a slight bug. I'd like to try to fix it, if I may, as I haven't used SQLAlchemy or Flask yet. You've worked so hard on this script and you should rest!
Thanks,
Anna
Hi all,
I've deployed the latest version of jacky. Here is the link for everyone to test it: http://jacky-env-01.ehym3crjpy.eu-west-2.elasticbeanstalk.com/ I initially had problems with the stats throwing an Internal Server Error. Also, our site complained when I uploaded az20 in a csv format, saying it wasn't csv or tsv.
I will note any more problems here, and see whether there's anything I can do to our configuration in AWS to make it work. I tried to change the "Breach Duration" from 5 to 1 ("the amount of time a metric can exceed a threshold before triggering a scaling operation") and it seemed to be applying the new setting, but I checked and it's at 5 again...
Everything looked beautiful on my local machine, well done!
Kind regards,
Anna