Closed: cornzy closed this issue 9 years ago
Ok, this has to be implemented!
Of course I would like to see a "Package Drone" scraper, but I guess this is just a fantasy ;-)
I would prefer the Nexus scraper: http://grepcode.com/file/repo1.maven.org/maven2/org.sonatype.nexus/nexus-core/2.11.0-02/org/sonatype/nexus/proxy/maven/routing/internal/scrape/NexusScraper.java?av=f
It only requires the "presence" of this ".meta" file and not some http server parameter. And I guess there is no harm in just creating another XML file during the channel aggregation for maven.
You're right, the Nexus scraper would be nicer. I thought it would be more complex.
I didn't fully check it, but at first glance it seems to be even easier, since you don't have to fake a Server header. Both variants need to create some sort of content (the index.html for Apache, the ".meta" file for Nexus). A dedicated XML file seems better to me, and then there is no need to reply with an altered server name.
On the other hand, there would be no "index.html" then, which would be interesting for browsing the channel. Maybe we should do both :wink:
But for the Scraper part, I tend to go with the Nexus Scraper.
So the next step would be to actually provide an index file, which the Nexus scraper still needs in addition to the ".meta" file.
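To make the "provide an index file" step a bit more concrete, here is a minimal sketch of rendering a plain HTML directory listing for one repository folder. It is only an illustration: `render_index` is a hypothetical helper, and the exact markup that the Nexus scraper expects is not spelled out in this thread, so the layout below is an assumption.

```python
from html import escape
from urllib.parse import quote


def render_index(path, entries):
    """Render a minimal HTML directory listing for one repository folder.

    `path` is the folder path as shown to the user, `entries` the child
    names. Names ending in "/" are treated as sub-directories.
    The markup is an assumption, not a format confirmed for any scraper.
    """
    title = f"Index of {path}"
    rows = []
    if path != "/":
        # Link back to the parent folder, except at the repository root.
        rows.append('<li><a href="../">../</a></li>')
    for name in sorted(entries):
        # URL-encode the link target, HTML-escape the visible label.
        rows.append(f'<li><a href="{quote(name)}">{escape(name)}</a></li>')
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n<ul>\n"
        + "\n".join(rows)
        + "\n</ul></body></html>\n"
    )


if __name__ == "__main__":
    print(render_index("/org/example/", ["artifact/", "maven-metadata.xml"]))
```

A listing like this would also cover the "browsing the channel" use case mentioned earlier, independent of which scraper Nexus ends up using.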
Is there a way you can test this with a local development environment? I did a test with Nexus OSS, and it shows the directory index. But I am not sure how I can do a real test.
Maybe I could test tomorrow, if I find some time. If you have admin access to any Nexus installation, you could simply add a proxy repository pointing to your local Package Drone, assuming that the Nexus has network access to your computer.
Ok, well, that is what I did. I set up a new Nexus (since we don't use it at all), added a new proxy repo pointing to a Package Drone channel, checked the "remote content" (I think it was) and saw the directory structure.
Is that it?
Sounds good. I also just configured a repository and could browse the Package Drone repository in Nexus. Another important indication is the "Routing" tab, where you'll find "Discovery Status: Successful".
For some reason my "Browse Index" is still empty. Maybe there is another issue to fix.
I noticed this as well. So is there something missing?
Ok, I just checked with the "Central" proxy repository in the Nexus default setup. Also there the "Browse Index" tab is empty.
Then let's close this issue.
Ok!
If you want to proxy a pdrone Maven repository in Nexus, it is not possible to browse the pdrone repository. In the Nexus configuration (tab "Routing") you get the message "No scraper was able to scrape remote (or remote prevents scraping)."
Nexus has a set of scrapers that look for familiar index formats, e.g. an index file from an Apache server would start with the text "Apache" :(
http://grepcode.com/file/repo1.maven.org/maven2/org.sonatype.nexus/nexus-core/2.11.0-02/org/sonatype/nexus/proxy/maven/routing/internal/scrape/HttpdIndexScraper.java#HttpdIndexScraper
Maybe pdrone could either fake a familiar index page, or maybe it is possible to extend the Nexus scrapers with a custom-written "pdrone-scraper".
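The scraper lookup described above can be pictured roughly like this. This is a simplified sketch and not the actual Nexus code: the `Remote` model, the two detection functions and the `discover` helper are all hypothetical, and the real checks in `HttpdIndexScraper` and `NexusScraper` are more involved. It only shows why a remote that matches no known format ends up with the "No scraper was able to scrape remote" message.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

NO_SCRAPER_MESSAGE = (
    "No scraper was able to scrape remote (or remote prevents scraping)."
)


@dataclass
class Remote:
    """Hypothetical view of a remote repository as a scraper might see it."""
    server_header: str = ""
    paths: Set[str] = field(default_factory=set)


def httpd_style(remote: Remote) -> bool:
    # Assumption: the httpd-style check keys off a server name
    # starting with "Apache".
    return remote.server_header.startswith("Apache")


def nexus_style(remote: Remote) -> bool:
    # Assumption: the Nexus-style check only needs a file under ".meta/".
    return any(p.startswith(".meta/") for p in remote.paths)


Scraper = Tuple[str, Callable[[Remote], bool]]
SCRAPERS: List[Scraper] = [("nexus", nexus_style), ("httpd", httpd_style)]


def discover(remote: Remote, scrapers: List[Scraper] = SCRAPERS) -> Optional[str]:
    """Return the name of the first scraper that recognizes the remote."""
    for name, recognizes in scrapers:
        if recognizes(remote):
            return name
    return None


if __name__ == "__main__":
    plain = Remote(server_header="Jetty", paths={"index.html"})
    print(discover(plain) or NO_SCRAPER_MESSAGE)
```

In this picture, "faking a familiar index page" means making one of the existing checks pass, while a custom scraper would be one more entry in the list.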