This agent needs documentation on how it works and on what is required to make it useful.
For example:
How it works - For the nodes that match its content rules, it makes an anonymous request to "http://localhost:4502" for the matching content path. This means it does not use HTTPS, and it expects the content to be available without any authentication (anonymously) on the local server.
It assumes it is running on a Publish server (where content is available anonymously). This is a little odd, since by default Publish servers run on port 4503, not 4502. The workaround is to set up a proxy on port 4502 that forwards to whatever port the Publish server is running on (typically 4503).
There is no way to change the protocol from HTTP to HTTPS, the port from 4502 to 4503, or the host name from "localhost" to anything else. [These can all be worked around using a proxy.]
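To illustrate the port workaround only, here is a minimal sketch of a TCP forwarder in Python that listens on 4502 and relays to 4503. It is an assumption-laden illustration, not a recommended deployment: the names `pipe` and `start_proxy` are hypothetical, it does not terminate TLS, and in practice a web server or dedicated proxy would normally do this job.

```python
import asyncio

LISTEN_PORT = 4502  # port the agent requests, per this issue
TARGET_PORT = 4503  # assumed Publish port (the default mentioned above)


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_proxy(listen_port=LISTEN_PORT, target_port=TARGET_PORT,
                      host="127.0.0.1"):
    """Start a server that relays each connection to host:target_port."""

    async def handle(client_reader, client_writer):
        upstream_reader, upstream_writer = await asyncio.open_connection(
            host, target_port)
        # Relay both directions concurrently until each side closes.
        await asyncio.gather(
            pipe(client_reader, upstream_writer),
            pipe(upstream_reader, client_writer),
        )

    return await asyncio.start_server(handle, host, listen_port)


async def serve():
    """Run the forwarder until interrupted."""
    server = await start_proxy()
    async with server:
        await server.serve_forever()

# To run the forwarder: asyncio.run(serve())
```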
These limitations should be noted in the documentation.
It takes the response from the HTTP request and stores it in the file system, in the directory specified in the Agent configuration.
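To make the request-and-store behavior described above concrete, here is a minimal Python sketch. It is not the agent's actual code: the function names and the way a content path maps to a file under the export directory are assumptions for illustration only. The only details taken from this issue are the hard-coded "http://localhost:4502" base URL, the anonymous request, and the write to a configured directory.

```python
import urllib.request
from pathlib import Path

# Fixed values as described in this issue: plain HTTP, localhost, port 4502.
BASE_URL = "http://localhost:4502"


def request_url(content_path, base_url=BASE_URL):
    """Build the URL for an anonymous request to a content path."""
    return base_url.rstrip("/") + "/" + content_path.lstrip("/")


def store_response(export_dir, content_path, body):
    """Write a response body under export_dir (hypothetical file layout)."""
    out = Path(export_dir) / content_path.lstrip("/")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(body)
    return out


def fetch_and_store(content_path, export_dir, base_url=BASE_URL):
    """GET the path with no credentials and store the response on disk."""
    with urllib.request.urlopen(request_url(content_path, base_url)) as resp:
        body = resp.read()
    return store_response(export_dir, content_path, body)
```

For example, `fetch_and_store("/content/site/en.html", "/tmp/export")` would request `http://localhost:4502/content/site/en.html` without authentication and, under the assumed layout, write the body to `/tmp/export/content/site/en.html`. Both the path and the directory here are made up.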
How to make it useful - The customer's unanswered question is how to get this agent to run for the content they want. Assuming they want only a subset of their content stored on the file system, not all of it, how does the customer set that up? What calls, or triggers, this agent? This needs to be documented. You will have to ask the engineers who wrote it, because I don't have that answer either.
Issue in ./help/sites-deploying/replication.md