NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0
44 stars 34 forks source link

[DOC] User doc tools should be clear about --filter_apps and also --filter-criteria #1055

Open viadea opened 1 month ago

viadea commented 1 month ago

Report incorrect documentation

Location of incorrect documentation https://docs.nvidia.com/spark-rapids/user-guide/latest/qualification/quickstart.html#command-options https://docs.nvidia.com/spark-rapids/user-guide/latest/qualification/jar-usage.html#deploying-tools-jar

Describe the problems or issues found in the documentation Python version has an option --filter_apps, -f. Jar version has another option -f, --filter-criteria <arg>

In fact, we can use jar version's option --filter-criteria in the python tool command line. We need to document it clearly that: a. In python tool's doc, what does -f mean? --filter_apps or --filter-criteria. b. In python tool's doc, can you give an example of both --filter_apps and --filter-criteria are used?

amahussein commented 1 month ago

Thanks @viadea !

Can I use --filter-criteria with the Python CLI cmd?

Actually, The python CLI allows argument forwarding to the Jar cmd. This is a feature that was hidden from the documentation based on the request of the product owners. I believe their intuition is to keep the documentation simple and keep this extra argument from internal usage only.

If you run spark_rapids qualification -- --help there will be in the last line

    Additional flags are accepted.
        A list of valid Qualification tool options.
        Note that the wrapper ignores ["output-directory", "platform"] flags, and it does not support
        multiple "spark-property" arguments. For more details on Qualification tool options, please visit
        https://docs.nvidia.com/spark-rapids/user-guide/latest/qualification/jar-usage.html#running-the-qualification-tool-standalone-on-spark-event-logs

For Example, in the python CLI, you can use --filter_criteria to be forwarded to the Jar cmd (note that the hyphen is becoming underscore)...

spark_rapids qualification   \
             --cluster $CPU_CLUSTER_PROPS   \
             --output_folder $LOCAL_OUTPUT_DIR   \
             --eventlogs $LOCAL_CPU_EVENT_LOGS   \
             --tools_jar $LOCAL_JAR   \
             --estimation_model XGBOOST   \
             --verbose \
             --filter_criteria 1-newest-filesystem

Then in the stdout, you can find that the python is passing --filter-criteria to the JAR cmd.

What is filter_apps?

This argument filters the results displayed in the STDOUT of the Python CLI.

filter_apps was a feature requested by the product owners to filter the results summary of the python STDOUT. For example, by default the STDOUT shows the applications from the TCL. Users interested in Cost-savings can override the behavior to show all the applications that can achieve a GPU cost savings. Users also can override that behavior to list all the applications (even with negative cost-savings/ or no speedups).

In summary, I doubt users are interested in this argument because everyone is consuming the CSV files. Mainly, it is used internally by our devs for Demo purposes.

Finally, the "SPEEDUPS" value will be dropped in the future because we are moving away from the legacy speedup.

   -f, --filter_apps=FILTER_APPS
        Type: Optional[str]
        Default: None
        filtering criteria of the applications listed in the final STDOUT table is one of the following (ALL, SPEEDUPS, SAVINGS, TOP_CANDIDATES). Requires "Cluster".

        Note that this filter does not affect the CSV report. "ALL" means no filter applied. "SPEEDUPS" lists all the apps that are either 'Recommended', or 'Strongly Recommended' based on speedups. "SAVINGS" lists all the apps that have positive estimated GPU savings except for the apps that are "Not Applicable". "TOP_CANDIDATES" lists all apps that have unsupported operators stage duration less than 25% of app duration and speedups greater than 1.3x.
amahussein commented 1 month ago

I agree with you that the two names are confusing. We can consider one of them to avoid confusion.