amir-jakoby / crawler-commons

Automatically exported from code.google.com/p/crawler-commons

[Sitemaps] Add more JUnit tests #42

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Currently we have low unit testing coverage.

We should add more unit tests to cover more Sitemap Parser code.

Original issue reported on code.google.com by avrah...@gmail.com on 22 May 2014 at 5:13

GoogleCodeExporter commented 8 years ago
This issue is dependent on issue 41.

I have created more unit tests covering the following:
* GZ sitemaps.
* Different types of MediaTypes
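
For readers without the attachment, here is a minimal sketch of the kind of in-memory fixture a GZ sitemap test could use. It is not taken from the patch: the class name `GzSitemapFixture`, its helper methods, and the sample URL are assumptions made for illustration, and only `java.util.zip` from the JDK is used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical fixture helper, not part of crawler-commons.
final class GzSitemapFixture {
    // A tiny sitemap; the URL is a placeholder.
    static final String SITEMAP_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n"
        + "  <url><loc>http://www.example.com/</loc></url>\n"
        + "</urlset>\n";

    // Compress text to GZIP bytes entirely in memory.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Decompress GZIP bytes back to a UTF-8 string.
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] compressed = gzip(SITEMAP_XML);
        String restored = gunzip(compressed);
        System.out.println("round trip ok: " + SITEMAP_XML.equals(restored));
    }
}
```

Generating the compressed bytes inside the test avoids depending on a binary file that a text patch may not carry.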

Original comment by avrah...@gmail.com on 22 May 2014 at 5:19

GoogleCodeExporter commented 8 years ago
Please note that two unit tests are currently marked @Ignore.

They would fail because our current code logic is wrong.

Once the patch for issue 40 is committed, I will re-enable them and they should pass.
These tests fail because MediaType detection is wrong: for example, the code 
looks at the file extension before looking at the leading bytes, and it 
recognizes "application/compress" and "application/octet" as GZ formats 
although they aren't.
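
To illustrate the ordering described above, here is a minimal sketch that checks the leading bytes first and falls back to the file extension only when there are too few bytes to inspect. It is not the crawler-commons code or the issue 40 patch: the class `GzipSniffer` and both methods are hypothetical names. The only outside fact it relies on is that GZIP streams begin with the bytes 0x1f 0x8b (RFC 1952).

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of crawler-commons.
final class GzipSniffer {
    // GZIP streams start with the two magic bytes 0x1f 0x8b (RFC 1952).
    static boolean hasGzipMagic(byte[] content) {
        return content != null
            && content.length >= 2
            && (content[0] & 0xff) == 0x1f
            && (content[1] & 0xff) == 0x8b;
    }

    // Leading bytes decide whenever they are available;
    // the extension is only a fallback for missing or too-short content.
    static boolean looksLikeGzip(byte[] content, String url) {
        if (content != null && content.length >= 2) {
            return hasGzipMagic(content);
        }
        return url != null && url.toLowerCase(Locale.ROOT).endsWith(".gz");
    }

    public static void main(String[] args) {
        byte[] gzHeader = {0x1f, (byte) 0x8b, 0x08};
        byte[] plainXml = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.UTF_8);

        // Plain XML served under a ".gz" name: the bytes win over the extension.
        System.out.println(looksLikeGzip(plainXml, "http://www.example.com/sitemap.xml.gz"));
        // GZIP bytes served under a ".xml" name: still detected as GZIP.
        System.out.println(looksLikeGzip(gzHeader, "http://www.example.com/sitemap.xml"));
    }
}
```

With this ordering, a mislabelled file is classified by what it actually contains, which is the behaviour the ignored tests expect.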

Original comment by avrah...@gmail.com on 22 May 2014 at 6:55

GoogleCodeExporter commented 8 years ago
I think Lewis's change (r128) to update to JUnit4 conventions broke this patch, 
as it doesn't cleanly apply. It would be great if you could regenerate against 
the current trunk, thanks!

Original comment by kkrugler...@transpac.com on 24 Jun 2014 at 3:03

GoogleCodeExporter commented 8 years ago
Sure, will do.

Original comment by avrah...@gmail.com on 24 Jun 2014 at 4:24

GoogleCodeExporter commented 8 years ago
Clean patch attached.

I have created more unit tests covering the following:
* GZ sitemaps.
* Different types of MediaTypes

Please note that two unit tests are currently marked @Ignore.

They would fail because our current code logic is wrong.

Once the patch for issue 40 is committed, I will re-enable them and they should pass.
These tests fail because MediaType detection is wrong: for example, the code 
looks at the file extension before looking at the leading bytes, and it 
recognizes "application/compress" and "application/octet" as GZ formats 
although they aren't.

Original comment by avrah...@gmail.com on 30 Jun 2014 at 6:02

GoogleCodeExporter commented 8 years ago
Avi, do you have the file xmlSitemap.gz available?
If you can attach it here I will commit this patch for you. Right now I cannot 
commit because the file is missing.
Thanks

Original comment by lewis.mc...@gmail.com on 1 Jul 2014 at 2:42

GoogleCodeExporter commented 8 years ago
Sure,

I thought it would be included in the patch itself.

I am attaching it now.

Original comment by avrah...@gmail.com on 1 Jul 2014 at 4:38

GoogleCodeExporter commented 8 years ago
Committed revision 130.
Good work Avi. Thank you for this patch.

Original comment by lewis.mc...@gmail.com on 1 Jul 2014 at 5:12