nasa-jpl-memex / memex-explorer

Viewers for statistics and dashboarding of Domain Search Engine data
BSD 2-Clause "Simplified" License

Ache needs nonzero exit codes #583

Closed amfarrell closed 8 years ago

amfarrell commented 9 years ago

ACHE does not return a nonzero exit code when it fails. As a result, Celery currently reports every finished task as successful, regardless of its log output. Instead, it should:

- preferably: grep the log or results for a definitive indication of success, and report an error if it is absent;
- at least: grep the log for indications of error, and report an error if any are found.
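A minimal sketch of the two checks described above, assuming the log has one entry per line in the format seen in the log excerpt posted in this thread. The names `check_ache_log`, `CrawlFailed`, and `success_marker` are hypothetical and are not part of ACHE or memex-explorer. The text ACHE prints on a successful crawl is not stated in this issue, so the caller has to supply it.

```python
import re

# Matches entries such as "[2015-06-16 19:21:45,578]ERROR main - ...".
# The pattern is inferred from the log excerpt in this thread,
# not from ACHE documentation.
ERROR_LINE = re.compile(
    r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]\s*ERROR\b"
)


class CrawlFailed(Exception):
    """Hypothetical exception: the crawl log indicates a failure."""


def check_ache_log(log_text, success_marker=None):
    """Raise CrawlFailed if the log contains an ERROR entry, or if a
    caller-supplied success marker is missing from the log."""
    # "At least" check: any ERROR entry counts as a failure.
    errors = [line for line in log_text.splitlines() if ERROR_LINE.match(line)]
    if errors:
        raise CrawlFailed(errors[0])
    # "Preferably" check: require positive evidence of success.
    # success_marker is an assumption; pass whatever text is known
    # to appear only in a successful run.
    if success_marker is not None and success_marker not in log_text:
        raise CrawlFailed("success marker not found: %r" % (success_marker,))
```

Calling a function like this at the end of the Celery task body, after the ACHE process exits, would let the exception propagate. Celery records a task that raises an exception as failed, so no exit code from ACHE is needed.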

amfarrell commented 9 years ago

The following error occurs when ACHE is started with the seeds list at source/test_resources/test_crawl_data/cats.seeds and the model at source/test_resources/test_model:

[2015-06-16 19:21:44,372] WARN main - Data output path already exists, deleting everything
[2015-06-16 19:21:44,381] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/link_storage/link_storage.cfg
[2015-06-16 19:21:44,616] INFO main - Number of seeds:7
[2015-06-16 19:21:44,629] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/link_storage/link_storage.cfg
LINK_CLASSIFIER:class focusedCrawler.link.classifier.LinkClassifierBaseline
[2015-06-16 19:21:44,649] INFO main - USE_SCOPE:false
[2015-06-16 19:21:44,652] INFO main - FRONTIER: class focusedCrawler.link.frontier.FrontierTargetRepositoryBaseline
TOTAL LOADED: 7
[2015-06-16 19:21:44,664] INFO main - >> LOADING GRAPH...
[2015-06-16 19:21:44,710] INFO main - >> DONE GRAPH.
[2015-06-16 19:21:44,713] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/target_storage/target_storage.cfg
[2015-06-16 19:21:44,717] INFO main - CONFIGURATION FILE = /home/vagrant/resources/models/1/pageclassifier.features
[2015-06-16 19:21:45,578]ERROR main - Problem while starting crawler.
java.lang.IllegalArgumentException: Attribute names are not unique! Causes: '?????????' '???the'
    at weka.core.Instances.<init>(Instances.java:259) ~[weka-stable-3.6.10.jar:na]
    at focusedCrawler.target.TargetStorage.createClassifier(TargetStorage.java:342) ~[ache-0.1.0.jar:na]
    at focusedCrawler.target.TargetStorage.createTargetStorage(TargetStorage.java:277) ~[ache-0.1.0.jar:na]
    at focusedCrawler.Main.startCrawl(Main.java:126) [ache-0.1.0.jar:na]
    at focusedCrawler.Main.main(Main.java:30) [ache-0.1.0.jar:na]

[2015-06-16 19:22:35,652] WARN main - Data output path already exists, deleting everything
[2015-06-16 19:22:35,658] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/link_storage/link_storage.cfg
[2015-06-16 19:22:35,873] INFO main - Number of seeds:7
[2015-06-16 19:22:35,878] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/link_storage/link_storage.cfg
LINK_CLASSIFIER:class focusedCrawler.link.classifier.LinkClassifierBaseline
[2015-06-16 19:22:35,886] INFO main - USE_SCOPE:false
[2015-06-16 19:22:35,887] INFO main - FRONTIER: class focusedCrawler.link.frontier.FrontierTargetRepositoryBaseline
TOTAL LOADED: 7
[2015-06-16 19:22:35,899] INFO main - >> LOADING GRAPH...
[2015-06-16 19:22:35,957] INFO main - >> DONE GRAPH.
[2015-06-16 19:22:35,958] INFO main - CONFIGURATION FILE = /home/vagrant/resources/crawls/ache1_1/config/target_storage/target_storage.cfg
[2015-06-16 19:22:35,962] INFO main - CONFIGURATION FILE = /home/vagrant/resources/models/1/pageclassifier.features
[2015-06-16 19:22:36,711]ERROR main - Problem while starting crawler.
java.lang.IllegalArgumentException: Attribute names are not unique! Causes: '?????????' '???the'
    at weka.core.Instances.<init>(Instances.java:259) ~[weka-stable-3.6.10.jar:na]
    at focusedCrawler.target.TargetStorage.createClassifier(TargetStorage.java:342) ~[ache-0.1.0.jar:na]
    at focusedCrawler.target.TargetStorage.createTargetStorage(TargetStorage.java:277) ~[ache-0.1.0.jar:na]
    at focusedCrawler.Main.startCrawl(Main.java:126) [ache-0.1.0.jar:na]
    at focusedCrawler.Main.main(Main.java:30) [ache-0.1.0.jar:na]

brittainhard commented 9 years ago

Should this be filed in the ACHE repo instead? The problem is with ACHE, not memex-explorer.

ahmadia commented 9 years ago

Yes, we should raise an issue there.

amfarrell commented 8 years ago

@brittainhard Could I please be un-assigned from this issue?