aappleby / smhasher

Automatically exported from code.google.com/p/smhasher
2.63k stars 467 forks source link

ppc64le port of smhasher. #52

Open asowani opened 6 years ago

asowani commented 6 years ago

Hi,

I am working on porting smhasher to Power8/LE (ppc64le) platform. I have done necessary code changes and carried out tests to ensure that there are no regressions on ppc64le. I will create a PR soon for these changes. Requesting smhasher community to review the changes.

Thanks, Atul.

asowani commented 6 years ago

@aappleby Would like to know if this PR can be reviewed and integrated with master. Thanks!

asowani commented 6 years ago

@aappleby Is this project dormant now? If not, would like to know if this code will be considered for merging. Thanks!

raeburn commented 6 years ago

A lot of these changes look like they're not specific to ppc64le, but they're conditionalized as if they are.

Some look questionable -- for example, if you pass CityMurmur a size larger than 2GB on a 64-bit system (where size_t should be 64 bits), changing variable "l" to "signed int" may quietly get you bad results as it's either truncated or treated as negative.

In the md5 changes, if the idea is that it should consistently use 32-bit types for arithmetic manipulation, perhaps you should change the code to unconditionally use uint32_t?

The changes to the MurmurHash2 table in main.cpp don't look like a good idea to me... I think "x86" and "x64" designates different hashes intended to be fast on CPUs in different modes (word sizes). They're not great names in a world where people want to run code on many different architectures, especially if they want to get consistent results across architectures. In any case, labeling them both as "MurmurHash2 for ppc64le" fails to distinguish between them or give any clue what the difference is. Off the top of my head I don't know what a better name would be, but #ifdefs to rewrite just the name for every CPU we might want to use (just x86 and ppc64le now, but what about armv8 and s390 and...) seems like a bad road to start down.

Unfortunately I haven't got a ppc64 box handy at the moment, so I can't even check whether you were trying to fix compilation problems or incorrect output or what. But I am interested in portability changes around some of the murmur hash code...