datagovuk / ckanext-dgu

CKAN extension for data.gov.uk
http://data.gov.uk/

Organograms - displaying wrong publisher's organogram #487

Closed davidread closed 8 years ago

davidread commented 8 years ago

This URL should show BBSRC's organogram: https://test.data.gov.uk/organogram/biotechnology-and-biological-sciences-research-council/2011-09-30 However for some reason it is showing ACAS's one!

This is evidenced by the chief exec's role of "Acas Chief Exectuive", and confirmed by looking at the spreadsheets.

davidread commented 8 years ago

@ratajczak I wonder if it can't find the file it is trying to show? I have been mucking about with the filenames recently - changing underscores to dashes and back in the past week.

davidread commented 8 years ago

Something weird is going on. Now I look at this one: https://test.data.gov.uk/organogram/human-fertilisation-and-embryology-authority/2011-09-30 and it is showing BBSRC. Could there be a caching thing going on?

ratajczak commented 8 years ago

I just clicked this link and everything looks OK to me. Did you see something different, or am I missing something? [image]

Regarding BBSRC you mentioned at the beginning it should be this file: https://test.data.gov.uk/sites/default/files/organogram/uploads/biotechnology_and_biological_sciences_research_council-2011-09-30-organogram.xls

but I get a 404 even though this file is in the ../organogram/uploads/ folder. It looks like issue #480.

I just copied this file to test.xls:

cd /var/www/files/drupal/dgud7/organogram/uploads/
cp biotechnology_and_biological_sciences_research_council-2011-09-30-organogram.xls test.xls

and under the new name the file downloads correctly: https://test.data.gov.uk/sites/default/files/organogram/uploads/test.xls

davidread commented 8 years ago

I believe that if your API can't find the XLS/CSV file then it returns whatever it last returned. In your case it was giving the wrong year's data for HFEA, because I was testing that most recently. For example, '13037' should have returned HFEA 2011-09 but it returns BBSRC:

curl 'https://test.data.gov.uk/organogram-ajax/preview/13037?_=1473754970005'
{"status":0,"data":{"name":"Biotechnology and Biological Sciences Research Council","value":"biotechnology_and_biological_sciences_research_council-2011-09-30-organogram"}}

Could you change this AJAX endpoint so that it returns a 404 when it can't find the file?
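
To make the request concrete, here is a minimal Python sketch of the behaviour I mean. It is not the actual Drupal code behind organogram-ajax: the function name `preview_response`, the `.xls`/`.csv` lookup and the error body are all assumptions for illustration. The point is only that each request looks up its own file and reports a 404 when it is absent, rather than reusing an earlier result.

```python
import os
from typing import Dict, Tuple

# Upload directory as quoted elsewhere in this thread; treat it as an assumption.
UPLOAD_DIR = "/var/www/files/drupal/dgud7/organogram/uploads"


def preview_response(name: str, value: str, upload_dir: str = UPLOAD_DIR) -> Tuple[int, Dict]:
    """Return (http_status, json_body) for one preview request.

    `value` is the file stem without extension (hypothetical parameter).
    """
    # Check only for the file belonging to this request; nothing is
    # remembered from earlier requests.
    for ext in (".xls", ".csv"):
        if os.path.isfile(os.path.join(upload_dir, value + ext)):
            return 200, {"status": 0, "data": {"name": name, "value": value}}
    # Missing file: report it explicitly instead of falling back to older data.
    # The error body shape here is made up for the sketch.
    return 404, {"status": 1, "error": "organogram file not found"}
```

With something like this, a request for a publisher whose spreadsheet is missing would fail visibly in the browser instead of silently showing another publisher's organogram.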

It looks like a whole bunch of files are missing from the uploads directory, judging by the numbers.

co@prod2 /var/www/drupal/dgud7/current () $ ls ~/organogram-data/data/dgu/xls-from-triplestore |grep 2011-03 |wc -l
141
co@prod2 /var/www/drupal/dgud7/current () $ ls /var/www/files/drupal/dgud7/organogram/uploads/* |grep 2011-03 |wc -l
5

co@prod2 /var/www/drupal/dgud7/current () $ ls ~/organogram-data/data/dgu/xls-from-triplestore |wc -l
329
co@prod2 /var/www/drupal/dgud7/current () $ ls ~/organogram-data/data/dgu/xls |wc -l
831
co@prod2 /var/www/drupal/dgud7/current () $ ls /var/www/files/drupal/dgud7/organogram/uploads/* |wc -l
760
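
The counts above show that files are missing but not which ones. A rough Python sketch for listing them by name follows. It assumes the uploaded files keep exactly the same names as the generated ones and that all directories are flat; `missing_from_uploads` is a hypothetical helper, not part of any existing script.

```python
import os
from typing import Iterable, List


def missing_from_uploads(source_dirs: Iterable[str], uploads_dir: str) -> List[str]:
    """List file names present in any source directory but absent from uploads_dir."""
    uploaded = set(os.listdir(uploads_dir))
    missing = set()
    for source in source_dirs:
        # Exact-name comparison: a file renamed between underscores and
        # dashes would be reported as missing.
        missing.update(name for name in os.listdir(source) if name not in uploaded)
    return sorted(missing)
```

It would be called with the two generated directories (xls-from-triplestore and xls) as `source_dirs` and the organogram uploads folder as `uploads_dir`. An empty list would mean every generated file has a counterpart in uploads.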

Maybe it's because I was regenerating the files at the same time as you loaded them in last weekend. Can you point me at your migration script and I can just check it is likely to work next time?

I've just copied all the files in now:

co@prod2 /var/www/drupal/dgud7/current () $ cp ~/organogram-data/data/dgu/xls-from-triplestore/* /var/www/files/drupal/dgud7/organogram/uploads/
co@prod2 /var/www/drupal/dgud7/current () $ cp ~/organogram-data/data/dgu/xls/* /var/www/files/drupal/dgud7/organogram/uploads/

and it works fine now:

$ curl 'https://test.data.gov.uk/organogram-ajax/preview/13037?1234'
{"status":0,"data":{"name":"Human Fertilisation and Embryology Authority","value":"human_fertilisation_and_embryology_authority-2011-09-30-organogram"}}

ratajczak commented 8 years ago

Thanks for investigating, that's helpful. I'll find and fix why it returns the previous file instead of a 404. Also, I'm not sure whether it matters that I hadn't updated tso_combined.csv before uploading. I'm going to clean everything up and re-upload everything again.

ratajczak commented 8 years ago

This is fixed and deployed. I'm not going to re-upload everything again.

davidread commented 8 years ago

Great, I'm assuming this is fixed now