ncbo / ncbo_cron

Jobs that run on a regular basis in the NCBO infrastructure

Spam deletion script broken in production #60

Closed by jvendetti 1 year ago

jvendetti commented 1 year ago

The spam deletion script is executing nightly as scheduled, but doesn't appear to be deleting spam anymore.

I added a new account to the list of spam users in this commit, then ran the spam deletion script manually. The output shows that the script did not delete the user ("buyadderallonline") or the ontology they uploaded (acronym ADDERALL), even though both should have been removed:

[ncbo-deployer@ncbo-prd-app-31 ncbo_cron]$ bin/ncbo_spam_deletion
(LD) >> Using rdf store ncboprod-4store1:8080/sparql/
(LD) >> Using term search server at http://ncbo-prd-sol-01.stanford.edu:8983/solr/term_search_core1
(LD) >> Using property search server at http://ncbo-prd-sol-01.stanford.edu:8983/solr/prop_search_core1
(LD) >> Using HTTP Redis instance at ncbo-prd-rds-02.sunet:6380
(LD) >> Using Goo Redis instance at ncbo-prd-rds-03.sunet:6381
(AN) >> Using ANN Redis instance at ncbo-prd-rds-01.sunet:6379
(CNFG) >> OntologyRecommender not available, cannot load config
(CNFG) >> OntologiesAPI not available, cannot load config
(CR) >> Using Redis instance at localhost:6379
Processing details are logged to STDOUT
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1440    0  1440    0     0   5978      0 --:--:-- --:--:-- --:--:--  6000
I, [2022-12-12T14:39:11.544077 #14145]  INFO -- : No users/projects/notes/reviews/ontologies/provisional classes found
Completed removing SPAM
I, [2022-12-12T14:39:11.544146 #14145]  INFO -- : Completed removing SPAM

A BioPortal user reported the spam ontology, so I manually deleted it (and also the user account). However, there appear to be a couple of newer spam entries on the Projects page that could be used for testing, e.g. /projects/PAGOMU.

The scheduler-spam-deletion.log file shows no errors.

mdorf commented 1 year ago

Looks like our current GitHub authorization token is invalid.

mdorf commented 1 year ago

I added some additional handling to the script so that it fails with a corresponding error if anything other than a successful fetch of the SPAM user list occurs.
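
For readers following along, here is a minimal sketch of what this kind of fail-fast handling could look like. It is not the actual change made to ncbo_cron: the `SpamListFetchError` class, the `parse_spam_user_list` method, and the assumption that the spam user list is a JSON array of usernames are all hypothetical and used only for illustration.

```ruby
require "json"

# Hypothetical error class for illustration; not taken from ncbo_cron.
class SpamListFetchError < StandardError; end

# Validates the raw result of fetching the spam user list.
#   status - Integer HTTP status code of the response
#   body   - String response body
# Assumption: the list is served as a JSON array of usernames.
# Returns the array of usernames, or raises SpamListFetchError.
def parse_spam_user_list(status, body)
  # Anything other than 200 (e.g. 401 for an invalid token) is a hard failure,
  # so the script never mistakes an error response for an empty spam list.
  unless status == 200
    raise SpamListFetchError, "spam user list fetch failed with HTTP #{status}"
  end

  users = JSON.parse(body)

  # A JSON object (such as an API error message) is not a valid user list.
  unless users.is_a?(Array)
    raise SpamListFetchError, "unexpected payload: expected a JSON array of usernames"
  end

  users
rescue JSON::ParserError => e
  raise SpamListFetchError, "spam user list is not valid JSON: #{e.message}"
end
```

A caller could rescue `SpamListFetchError`, log the message, and exit with a non-zero status, so that a failed fetch shows up as an error in the log instead of a "No users ... found" run that looks successful.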