adsabs / ADSReferencePipeline

Pipeline to resolve references (i.e., match each one with its record in ADS)
MIT License

Error processing SPIE references #3

Open ehenneken opened 6 months ago

ehenneken commented 6 months ago

Command issued: python3 run.py RESOLVE -p /proj/ads/references/sources/SPIE -e *.xml -d 60

Error generated:

Traceback (most recent call last):
  File "run.py", line 323, in <module>
    process_files(source_filenames)
  File "run.py", line 104, in process_files
    toREFs = parser(filename=filename, buffer=None)
  File "/app/adsrefpipe/refparsers/SPIExml.py", line 196, in __init__
    XMLtoREFs.__init__(self, filename, buffer, parsername=SPIEtoREFs, tag='ref', cleanup=self.block_cleanup)
  File "/app/adsrefpipe/refparsers/toREFs.py", line 401, in __init__
    pairs = self.get_references(filename=filename)
  File "/app/adsrefpipe/refparsers/toREFs.py", line 437, in get_references
    return self.get_reference_blob(buffer, self.detect_ref_format(buffer))
  File "/app/adsrefpipe/refparsers/toREFs.py", line 464, in get_reference_blob
    match = pattern.search(buffer)
AttributeError: 'NoneType' object has no attribute 'search'

golnazads commented 5 months ago

@ehenneken @aaccomazzi This fails because the command only looks at files from the most recent 60 days. I checked several recent directories and found empty files; the one non-empty file I found is processed correctly. I never considered that an XML reference file could be completely empty. I am going to add a warning and quit in this case, but I thought you might want to investigate what has gone wrong.
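For illustration only, a minimal sketch of the kind of guard described above, assuming the failure comes from format detection returning `None` for an empty buffer. The names `REF_BLOCK`, `detect_ref_format`, and `get_reference_blocks` are hypothetical stand-ins and are not the actual code in `toREFs.py`:

```python
import logging
import re
from typing import List, Optional

logger = logging.getLogger(__name__)

# Hypothetical pattern: one <ref>...</ref> block (stand-in for the real format table).
REF_BLOCK = re.compile(r"<ref\b.*?</ref>", re.DOTALL)


def detect_ref_format(buffer: str) -> Optional[re.Pattern]:
    """Return the block pattern if the buffer holds at least one <ref>, else None."""
    return REF_BLOCK if REF_BLOCK.search(buffer) else None


def get_reference_blocks(buffer: str, filename: str = "<buffer>") -> List[str]:
    """Extract reference blocks, warning and returning [] instead of raising."""
    # An empty (or whitespace-only) file has nothing to parse: warn and quit early.
    if not buffer or not buffer.strip():
        logger.warning("Reference file %s is empty; skipping.", filename)
        return []

    pattern = detect_ref_format(buffer)
    # Guard against the None that led to the AttributeError on `.search`.
    if pattern is None:
        logger.warning("No recognizable reference format in %s; skipping.", filename)
        return []

    return pattern.findall(buffer)
```

With a guard like this, an empty SPIE file would produce a warning in the log rather than a traceback, and the run could continue with the next file.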