AndRossi / Kelpie

XAI framework for interpreting Link Predictions on Knowledge Graphs

Minor availability issues #3

Closed Shamazo closed 2 years ago

Shamazo commented 2 years ago

Hi,

Thank you for the very detailed and self-contained scripts for availability/reproducibility. There are a few issues I have run into:

1) There are absolute paths in reproducibility_generate_pdf.sh which depend on the location of the cloned repo

2) After fixing the absolute paths, generate_pdf_report.py fails while running reproducibility_generate_pdf.sh. The backtrace is:

Traceback (most recent call last):
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 523, in open_for_read
    return open_for_read_by_name(name,mode)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 463, in open_for_read_by_name
    return open(name,mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 530, in open_for_read
    return BytesIO((datareader if name[:5].lower()=='data:' else rlUrlRead)(name))
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 476, in rlUrlRead
    return urlopen(name).read()
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 509, in open
    req = Request(fullurl, data)
  File "/usr/lib/python3.8/urllib/request.py", line 328, in __init__
    self.full_url = url
  File "/usr/lib/python3.8/urllib/request.py", line 354, in full_url
    self._parse()
  File "/usr/lib/python3.8/urllib/request.py", line 383, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: '/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 643, in __init__
    fp = open_for_read(fileName,'b')
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 532, in open_for_read
    raise IOError('Cannot open resource "%s"' % name)
OSError: Cannot open resource "/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "generate_pdf_report.py", line 191, in <module>
    generate_pdf()
  File "generate_pdf_report.py", line 187, in generate_pdf
    doc.build(flowables)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/doctemplate.py", line 1317, in build
    BaseDocTemplate.build(self,flowables, canvasmaker=canvasmaker)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/doctemplate.py", line 1082, in build
    self.handle_flowable(flowables)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/doctemplate.py", line 931, in handle_flowable
    if frame.add(f, canv, trySplit=self.allowSplitting):
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/frames.py", line 169, in _add
    w, h = flowable.wrap(aW, h)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/flowables.py", line 511, in wrap
    return self.drawWidth, self.drawHeight
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/flowables.py", line 505, in __getattr__
    self._setup_inner()
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/flowables.py", line 462, in _setup_inner
    img = self._img
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/platypus/flowables.py", line 499, in __getattr__
    self._img = ImageReader(self._file)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 655, in __init__
    annotateException('\nfileName=%r identity=%s'%(fileName,self.identity()))
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 1176, in annotateException
    rl_reraise(t,t(sep.join((_ for _ in (msg,str(v),postMsg) if _))),b)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 138, in rl_reraise
    raise v.with_traceback(b)
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 643, in __init__
    fp = open_for_read(fileName,'b')
  File "/home/nicholso/.local/lib/python3.8/site-packages/reportlab/lib/utils.py", line 532, in open_for_read
    raise IOError('Cannot open resource "%s"' % name)
OSError:
fileName='/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png' identity=[ImageReader@0x7f3d4e4a66a0 filename='/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png'] Cannot open resource "/home/nicholso/tmp/Kelpie/reproducibility_images/extraction_times_with_and_without_prefilter_plot.png"

Based on the contents of reproducibility_generate_pdf.sh, I think a call to the script that generates extraction_times_with_and_without_prefilter_plot.png is missing.

3) At the end of the "running paper experiments" section of the README there is a minor typo in which script to run. I believe "So after running the script, is is sufficient to re-run the PDF generation script reproducibility_environment.sh to obtain an up-to-date PDF report." should be "So after running the script, it is sufficient to re-run the PDF generation script sh reproducibility_generate_pdf.sh to obtain an up-to-date PDF report."
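
For point 2, a pre-flight check along the following lines would probably surface the missing plot before reportlab is invoked, instead of producing the long chained traceback. This is only a sketch: `find_missing_images` is a hypothetical helper, not part of the Kelpie scripts, and the directory and file names in the usage comment are simply copied from the traceback above.

```python
from pathlib import Path


def find_missing_images(image_dir, required):
    """Return the names in `required` that are not files inside `image_dir`.

    Hypothetical helper for illustration; it is not taken from Kelpie.
    """
    image_dir = Path(image_dir)
    return [name for name in required if not (image_dir / name).is_file()]


# Possible usage before building the PDF (names copied from the traceback):
# missing = find_missing_images(
#     "reproducibility_images",
#     ["extraction_times_with_and_without_prefilter_plot.png"],
# )
# if missing:
#     raise SystemExit("Missing plots, run their scripts first: " + ", ".join(missing))
```

A check like this points directly at the plot-generating script that was not run, which is the suspected cause here.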

I am running this on Ubuntu 20.04 with Python 3.8.10 in a clean venv, only installing packages through reproducibility_environment.sh. If you need any additional information to reproduce point 2, let me know.

Would you be able to take a look at these and update the scripts if necessary?

Thank you, Hamish

AndRossi commented 2 years ago

Hi Hamish,

Thank you for writing! I am sorry about the inconvenience; I definitely should have been more careful while writing reproducibility_generate_pdf.sh 😅

I have just pushed a fix that should correct all three issues you highlighted:

  1. I got rid of all absolute paths in reproducibility_generate_pdf.sh;
  2. I added the execution of plot_prefilter_vs_noprefilter_extraction_times.py to reproducibility_generate_pdf.sh;
  3. In a separate commit, I updated README.md, replacing that occurrence of reproducibility_environment.sh with the correct script, reproducibility_generate_pdf.sh.
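
The thread does not show how the absolute paths were removed. One common way to make such paths independent of where the repository is cloned is to anchor them at the location of the running script, sketched here in Python. `resolve_from_script` is a hypothetical name used only for illustration, and the file name in the usage comment is a placeholder.

```python
from pathlib import Path


def resolve_from_script(script_file, *parts):
    """Build a path anchored at the directory that contains `script_file`.

    Hypothetical helper for illustration; it is not taken from Kelpie.
    """
    return Path(script_file).resolve().parent.joinpath(*parts)


# Inside a script one would typically pass __file__, for example:
# image_path = resolve_from_script(__file__, "reproducibility_images", "some_plot.png")
```

With this approach the result depends only on where the script lives, not on the current working directory or the clone location.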

With a fresh clone of the repository, the script reproducibility_generate_pdf.sh runs fine in my environment now. I think you can just pull the updates with git pull origin master and it should work well in your environment too.

Please let me know if it does, or if you find any other issues! I will be glad to help.

Best,

Andrea

Shamazo commented 2 years ago

Thank you Andrea, it all runs smoothly now :)