Closed DSuveges closed 7 months ago
Hi @DSuveges, I just checked and there are indeed empty files. I don't know the reason yet, so this needs investigation, since the process runs without reporting any problem.
Thank you @tsantosh7 for jumping on the issue so quickly!
Hi @DSuveges
There was a bug in the abstract pipeline, which is now fixed. I have rerun the pipeline for all the empty directories and updated the daily pipeline. Everything is now reflected in Google Storage. Could you please check and let me know if it's OK?
Thanks @tsantosh7, the data is coming in! We can close this ticket.
Since late September 2023, all output files from the abstract pipeline have been empty:
The progress of the pipeline is tracked on Slack. However, there was no indication that the jobs were failing (short of checking the file contents manually). One example output is here. It says:
This is the expected output:
gs://otar025-epmc/ml02/abstract/2024_01_19/NMP_patch-18-01-2024-35.jsonl
It has no content.
@tsantosh7, could you please take a look? Also, please let us know if you need further details for the investigation. The full-text processing pipeline seemingly works fine.
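Since the only way to notice the problem was to open the files by hand, an automated check on object sizes might help. Below is a minimal sketch, not the pipeline's actual code. It assumes the listing comes from something like `gsutil ls -l <prefix>`, whose object lines have the form `<size>  <timestamp>  <path>`. The function names and the sample lines (including the `example-bucket` paths) are hypothetical and used for illustration only.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_listing(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (path, size_in_bytes) from lines shaped like `gsutil ls -l` output.

    Assumption: each object line is '<size>  <timestamp>  <path>'.
    Blank lines and the trailing 'TOTAL:' summary line are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("TOTAL:"):
            continue
        size, _timestamp, path = line.split(None, 2)
        yield path, int(size)


def find_empty_outputs(
    listing: Iterable[Tuple[str, int]], suffix: str = ".jsonl"
) -> List[str]:
    """Return the paths of zero-byte files whose names end with `suffix`."""
    return [path for path, size in listing if path.endswith(suffix) and size == 0]


# Hypothetical listing lines, not real pipeline output.
sample = [
    "0  2024-01-19T02:00:00Z  gs://example-bucket/abstract/2024_01_19/part-1.jsonl",
    "1024  2024-01-19T02:00:00Z  gs://example-bucket/abstract/2024_01_19/part-2.jsonl",
    "TOTAL: 2 objects, 1024 bytes (1 KiB)",
]

empty = find_empty_outputs(parse_listing(sample))
if empty:
    # In a real job this is where an alert (for example, a Slack message) could be sent.
    print(f"{len(empty)} empty output file(s):", *empty, sep="\n  ")
```

A check like this could run after each daily job and post to the same Slack channel, so that empty outputs are flagged even when the job itself exits without errors.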