darshan-hpc / darshan

Darshan I/O characterization tool

unable to create pdf file #967

Closed ashwinidr23 closed 9 months ago

ashwinidr23 commented 9 months ago

I have installed darshan-runtime and darshan-util and generated a log file, but I am unable to run darshan-job-summary.pl; it fails with the error below. Can you please let me know if I have missed anything? I have installed all the dependencies mentioned in the documentation, such as lastpage, subfigure, and threeparttable.

554_2.darshan
/scratch/logdir1
Status: Generating summary for file 1 of 1:
Warning: empty y range [0:0], adjusting to [-1:1]
Warning: empty y range [0:0], adjusting to [-1:1]
LaTeX generation (phase1) failed [256], aborting summary creation.
error log:
or enter new name. (Default extension: sty)
Enter file name:
! Emergency stop.
<read *>
l.5 \usepackage {lastpage}^^M
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on summary.log.
darshan-summary-per-file.sh done. Results can be found in /scratch/logdir1/*.pdf.

shanedsnyder commented 9 months ago

FWIW, we're not really supporting this older Perl-based job summary tool going forward, in part due to these sorts of dependency problems. The error reporting out of the latex tools here isn't always great either, but the best advice I can give is to make sure you have installed the texlive-latex-extra (at least that's what it is on Ubuntu systems) package with your package manager (e.g., apt-get install texlive-latex-extra).
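Before rerunning the Perl tool, it may help to confirm that TeX can actually resolve the style files named in the error. The following is a minimal sketch, assuming a TeX Live installation that provides the `kpsewhich` lookup tool on PATH. The helper names `kpsewhich`, `missing_styles`, and the `locate` parameter are hypothetical and not part of Darshan.

```python
import shutil
import subprocess

# Style files mentioned in this thread; extend the list as needed.
STYLES = ["lastpage.sty", "subfigure.sty", "threeparttable.sty"]

def kpsewhich(filename):
    """Return the path TeX resolves for `filename`, or None if it is not found."""
    if shutil.which("kpsewhich") is None:
        return None  # no TeX lookup tool on PATH, so nothing can be resolved
    result = subprocess.run(
        ["kpsewhich", filename], capture_output=True, text=True
    )
    path = result.stdout.strip()
    return path or None

def missing_styles(styles, locate=kpsewhich):
    """Return the style files that `locate` cannot find (hypothetical helper)."""
    return [name for name in styles if locate(name) is None]

if __name__ == "__main__":
    for name in missing_styles(STYLES):
        print(f"not found: {name}")
```

Any file reported as not found points at a missing TeX package, which on Ubuntu would typically be covered by texlive-latex-extra as suggested above.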

I'd recommend trying to switch to our new Python-based analysis tool chain. Here are the docs for that, including details on how to install and generate similar job summary reports (now HTML-based): https://www.mcs.anl.gov/research/projects/darshan/docs/pydarshan/index.html. These tools already provide more info than the older Perl tool and will be the sole focus of any future improvements.
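For a directory full of logs, the report generation can be scripted. This is a rough sketch, assuming PyDarshan is installed in the current Python environment and that reports are produced with the `python -m darshan summary <log>` command described in the PyDarshan docs. The function names `summary_command` and `summarize_all` and the `*.darshan` file pattern are illustrative assumptions.

```python
import subprocess
import sys
from pathlib import Path

def summary_command(log_path):
    """Build the PyDarshan summary command line for one log file."""
    return [sys.executable, "-m", "darshan", "summary", str(log_path)]

def summarize_all(log_dir):
    """Run the summary command on every *.darshan file in `log_dir`."""
    for log_path in sorted(Path(log_dir).glob("*.darshan")):
        result = subprocess.run(summary_command(log_path))
        if result.returncode != 0:
            # Keep going so one bad log does not stop the whole batch.
            print(f"summary failed for {log_path}", file=sys.stderr)

# Example (path taken from the log output earlier in this thread):
# summarize_all("/scratch/logdir1")
```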

ashwinidr23 commented 9 months ago

Thank you for the prompt response. I installed PyDarshan and I am able to create job summary reports. Is there a way to create a single summary report from all the HTML files available at a particular location? I am trying to study the I/O utilization of a particular mount/directory, so I am collecting the utilization reports of all the applications that use this mount. If there is already a way to summarize the I/O utilization from the HTML files, please do let me know.

Also, I noticed that some reports had "Data Access by Category" but others do not. This is very useful for my use case, which is determining the I/O utilization by file system or mount. Is there any way I can make this category available by default?

Thank you very much in advance!

shanedsnyder commented 9 months ago

> Thank you for the prompt response. I installed PyDarshan and I am able to create job summary reports. Is there a way to create a single summary report from all the HTML files available at a particular location? I am trying to study the I/O utilization of a particular mount/directory, so I am collecting the utilization reports of all the applications that use this mount. If there is already a way to summarize the I/O utilization from the HTML files, please do let me know.

Unfortunately, this isn't something we have any tools for right now. I think there's definitely some widespread interest from the Darshan team and users in having something like this, though. I likely won't have much time to work on this in the short term (next couple of months), but maybe it's something we can try to figure out later.

> Also, I noticed that some reports had "Data Access by Category" but others do not. This is very useful for my use case, which is determining the I/O utilization by file system or mount. Is there any way I can make this category available by default?

This table should be generated for all logs that have any data from POSIX or STDIO instrumentation modules. Can you confirm whether the logs in question have POSIX or STDIO data?
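One way to answer that question for a batch of logs is to list the modules each log contains. This is a sketch only, assuming PyDarshan's `darshan.DarshanReport` class and its `modules` mapping behave as documented. The names `has_category_data` and `log_modules` are hypothetical helpers.

```python
# Modules whose data feeds the table, per the comment above.
CATEGORY_MODULES = {"POSIX", "STDIO"}

def has_category_data(module_names):
    """True if any listed module is one the table is built from."""
    return bool(CATEGORY_MODULES & set(module_names))

def log_modules(log_path):
    """List module names recorded in a Darshan log (assumes PyDarshan is installed)."""
    import darshan  # imported lazily so the pure check above works without it

    # read_all=False is assumed to load metadata without reading every record.
    report = darshan.DarshanReport(log_path, read_all=False)
    return list(report.modules)

# Example with a hypothetical log path:
# print(has_category_data(log_modules("/scratch/logdir1/554_2.darshan")))
```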

FWIW, if you're really interested in using this table specifically, you might be able to have a closer look at the code to see how it could be modified to iteratively update the table as you input logs one by one. I haven't looked at the code in a while, but conceptually it seems like it wouldn't be too hard to feed it data from multiple logs (though this is not how it's used in our existing summary reports). We might not have the cycles to write the code for this, but we could probably sanity check any code you come up with to see if it makes sense to upstream into PyDarshan.
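The idea of updating totals one log at a time can be sketched roughly as follows. This is not how PyDarshan builds its own table; it is a simplified illustration that sums POSIX bytes read and written for files under one mount point. It assumes that `darshan.DarshanReport(..., read_all=True)`, `report.records["POSIX"].to_df()`, and `report.name_records` behave as in the PyDarshan docs, and all function names here are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def is_under(path, mount):
    """True if `path` is `mount` itself or lies below it (component-wise match)."""
    mount_parts = PurePosixPath(mount).parts
    return PurePosixPath(path).parts[:len(mount_parts)] == mount_parts

def add_log_totals(totals, file_records, mount):
    """Fold one log's (path, bytes_read, bytes_written) records into `totals`."""
    for path, bytes_read, bytes_written in file_records:
        if is_under(path, mount):
            totals["bytes_read"] += bytes_read
            totals["bytes_written"] += bytes_written
            totals["files"] += 1
    return totals

def posix_file_records(log_path):
    """Yield per-file POSIX byte counts from one log (assumes PyDarshan is installed)."""
    import darshan  # imported lazily so the aggregation above works without it

    report = darshan.DarshanReport(log_path, read_all=True)
    if "POSIX" not in report.records:
        return
    counters = report.records["POSIX"].to_df()["counters"]
    for row in counters.itertuples(index=False):
        path = report.name_records.get(row.id, "")
        yield path, int(row.POSIX_BYTES_READ), int(row.POSIX_BYTES_WRITTEN)

# Example with hypothetical paths:
# totals = defaultdict(int)
# for log in ["a.darshan", "b.darshan"]:
#     add_log_totals(totals, posix_file_records(log), "/scratch")
```

Matching on path components rather than string prefixes keeps a mount such as /scratch from also matching /scratch2.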

ashwinidr23 commented 9 months ago

> Unfortunately, this isn't something we have any tools for right now. I think there's definitely some widespread interest from the Darshan team and users in having something like this, though. I likely won't have much time to work on this in the short term (next couple of months), but maybe it's something we can try to figure out later.

Thank you for providing this information. I understand that the development of such features may take time, and I truly appreciate your efforts and consideration. I will keep an eye out for any updates in the future! For now, I can still get some data by creating the human-readable reports using PyDarshan and then parsing these reports for the data I need.

ashwinidr23 commented 9 months ago

> FWIW, if you're really interested in using this table specifically, you might be able to have a closer look at the code to see how it could be modified to iteratively update the table as you input logs one by one. I haven't looked at the code in a while, but conceptually it seems like it wouldn't be too hard to feed it data from multiple logs (though this is not how it's used in our existing summary reports). We might not have the cycles to write the code for this, but we could probably sanity check any code you come up with to see if it makes sense to upstream into PyDarshan.

I was experimenting with IOR when I saw this. Sometimes I got the "Data Access by Category" table and sometimes I did not. I am attaching one of the reports where the table was missing, for your reference. Now that I am trying to reproduce it on purpose, I am not able to :) Perhaps I missed something, but my reports are looking good now. Thanks once again!

Darshan Summary Report.docx

shanedsnyder commented 9 months ago

No worries, happy to be of a little bit of help! If you are able to trip the issue you mention, with logs missing the "Data Access by Category" table, more consistently, please let us know. Also, if you happen to still have a Darshan log for the example report you shared, that might be helpful to see if we could reproduce and maybe fix a bug.