ywangd / pybufrkit

Pure Python toolkit to work with WMO BUFR messages
http://pybufrkit.readthedocs.io/
MIT License

Test data #3

Closed hugovk closed 7 years ago

hugovk commented 7 years ago

I'm a member of the open-source Pillow imaging library for Python.

https://github.com/python-pillow/Pillow

I'm trying to increase test coverage of the 20-year-old-plus code base by adding unit tests for more file formats.

Pillow has very limited BUFR support: only a stub that can recognise the format. Read and write would require an extra handler. Nevertheless, I'd like to test what is there.

Would it be possible to use and redistribute a BUFR file from https://github.com/ywangd/pybufrkit/tree/master/tests/data/ (or benchmark_data/) in the Pillow codebase as part of the test suite?

Pillow uses an open source PIL Software License: https://github.com/python-pillow/Pillow/blob/master/LICENSE

Thank you!

ywangd commented 7 years ago

That is actually an excellent question, and I hadn't thought about it. Most of the test data are taken from ECMWF's BUFRDC, and a few come from ecCodes. I just grabbed them without thinking too much. Both packages use the Apache License.

However, I am not sure whether this license extends to the data. As far as I can tell, the data files are real data taken from daily operations (most of them also generated by ECMWF), so it seems to me that the original license of the data should apply rather than the Apache License. I don't know what that original license is, and some of the data could be public as requested by WMO.

This is a tricky question and I am afraid that I can neither confirm nor deny your request. :)

hugovk commented 7 years ago

Thanks, at least knowing the source helps :)

Do you know somewhere that has free-to-use BUFR files (and/or GRIB files)?

Alternatively, perhaps I could use pybufrkit to generate a dummy file :) What sort of JSON do you recommend to create a small, simple file using, say, pybufrkit encode JSON_FILE BUFR_FILE? I only really need the most basic file with the header.
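For reference, here is a rough sketch of what "the most basic file with the header" would have to satisfy. It assumes BUFR edition 2 or later, where Section 0 is 8 octets (the 4-byte "BUFR" marker, a 3-byte big-endian total message length, and a 1-byte edition number) and the message ends with "7777". The function name describe_bufr_header is hypothetical and is not part of pybufrkit or Pillow.

```python
def describe_bufr_header(data: bytes) -> dict:
    """Summarise Section 0 and the end marker of one BUFR message in memory.

    Assumes BUFR edition 2 or later (8-octet Section 0).
    """
    # 8 octets for Section 0 plus 4 octets for the "7777" end section.
    if len(data) < 12:
        raise ValueError("too short to be a BUFR message")
    if data[:4] != b"BUFR":
        raise ValueError("missing 'BUFR' start marker")

    total_length = int.from_bytes(data[4:7], "big")  # octets 5-7
    edition = data[7]                                # octet 8

    return {
        "total_length": total_length,
        "edition": edition,
        # True when the declared length equals the number of bytes given.
        "length_matches": total_length == len(data),
        "has_end_marker": data[-4:] == b"7777",
    }
```

Reading a candidate file in binary mode and passing its bytes to this function would show whether the generated file at least carries a plausible header. It only handles a single message per file and does not validate Sections 1 to 4.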

Thanks again!

ywangd commented 7 years ago

Not sure where you can find free BUFR/GRIB files. Of course, you could just use pybufrkit to generate BUFR files. In fact, I created one such file this way: https://github.com/ywangd/pybufrkit/blob/master/tests/data/contrived.bufr You can use this file freely.

Preparing a JSON file for encoding by hand can be quite error-prone. I would recommend starting from the output of pybufrkit decode -j SOME_BUFR_FILE and then editing that JSON to suit your needs.

Here is another free one that I modified from upper-air observation data.

[["BUFR",0,4],[0,0,9,0,0,false,"0000000",2,4,0,18,0,2017,1,1,0,0,5],[0,"00000000",1,true,false,"000000",[309052]],[0,"00000000",[[15,325,"SOME STATION",80,4,8,7,18,2017,1,1,0,0,5,25.3241,28.015,598,599,599,null,null,null,null,null,null,null,null,null,2,0,65536,100000,90,0,-1e-05,null,null,null,null,0,145472,94360,599,0,-1e-05,298.05,282.01,137,8.2,0]]],["7777"]]
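
The "decode, then edit" approach can be sketched in Python with only the standard library. This is a rough illustration, not pybufrkit's own API: the helper replace_value and the replacement name "TEST STATION" are made up for the example, and reading the five top-level lists as Sections 0, 1, 3, 4 and 5 (with no optional Section 2) is an assumption based on the shape of the JSON above. The replacement string is kept the same length as the original as a precaution.

```python
import json

# Flat JSON copied verbatim from the message above.
FLAT_JSON = (
    '[["BUFR",0,4],'
    '[0,0,9,0,0,false,"0000000",2,4,0,18,0,2017,1,1,0,0,5],'
    '[0,"00000000",1,true,false,"000000",[309052]],'
    '[0,"00000000",[[15,325,"SOME STATION",80,4,8,7,18,2017,1,1,0,0,5,'
    '25.3241,28.015,598,599,599,null,null,null,null,null,null,null,null,null,'
    '2,0,65536,100000,90,0,-1e-05,null,null,null,null,0,145472,94360,599,0,'
    '-1e-05,298.05,282.01,137,8.2,0]]],'
    '["7777"]]'
)


def replace_value(node, old, new):
    """Return a copy of a nested list with values equal to `old` replaced.

    The type check keeps 0 and False (or 1 and True) from being confused.
    """
    if isinstance(node, list):
        return [replace_value(item, old, new) for item in node]
    if type(node) is type(old) and node == old:
        return new
    return node


message = json.loads(FLAT_JSON)

# Hypothetical edit: swap the placeholder station name for another one.
edited = replace_value(message, "SOME STATION", "TEST STATION")

# Compact output, in the same style as the JSON shown above.
compact = json.dumps(edited, separators=(",", ":"))
print(compact)
```

Saving the printed text to a file gives something that could be passed as JSON_FILE to pybufrkit encode JSON_FILE BUFR_FILE. Whether the encoder accepts a particular edit still depends on the descriptors in the message, so larger changes are best checked by decoding the result again.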

hugovk commented 7 years ago

Thank you!