jjhelmus / nmrglue

A module for working with NMR data in Python
BSD 3-Clause "New" or "Revised" License

Test datasets: Alternative needed for the Google-Code archive #87

Open kaustubhmote opened 5 years ago

kaustubhmote commented 5 years ago

I am planning to push a PR (currently on my master branch) that writes processed Bruker data back into a format that TopSpin can read. This is a fairly big addition, and I have included some basic tests to make sure everything works OK. For the new tests to work, I will need to push some additional test datasets (processed TopSpin data) and some changes to the conversion scripts.

Is there an alternative way of storing these datasets and making them available to the CI, since Google-Code is no longer functional?

jjhelmus commented 5 years ago

How big a dataset do the tests require? For data in the ~10-100 kilobyte range, including the file directly in the repository in a data folder should be fine. Larger datasets could be stored in an nmrglue_data repository or as a release artifact, which has a 2GB upper limit.
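As a rough sketch of the size-based rule above: the function names and exact byte thresholds here are illustrative assumptions, not part of nmrglue or its test suite.

```python
from pathlib import Path

# Assumed thresholds, taken loosely from the sizes mentioned in this thread.
IN_REPO_LIMIT = 100 * 1024      # ~100 kB: upper end of "fine to commit directly"
RELEASE_LIMIT = 2 * 1024 ** 3   # 2 GB: limit quoted for a release artifact


def suggest_storage(size_bytes):
    """Return a storage suggestion for a test dataset of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= IN_REPO_LIMIT:
        return "repository data folder"
    if size_bytes <= RELEASE_LIMIT:
        return "nmrglue_data repository or release artifact"
    return "too large for a release artifact"


def dataset_size(path):
    """Total size in bytes of a file, or of all files under a directory."""
    path = Path(path)
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())
```

For example, `suggest_storage(dataset_size("some_dataset_dir"))` would classify a dataset directory, where `some_dataset_dir` is a placeholder path.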

kaustubhmote commented 5 years ago

The additional datasets are themselves relatively small at ~2.5MB (of which a single 2D is ~2MB). That is still bigger than the nmrglue codebase (!), but I am testing the ability to read and write processed 2D datasets from and into the Bruker format, which makes smaller sizes a bit difficult. For the current testing requirements, I will make smaller datasets to fit into the data folder tests in fileio and/or change some tests.

However, I think it would be great to have a way to update the test_data_v0.4-dev.zip on the Google-Code archive and run the more extensive tests, at least locally.
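One possible way to keep the more extensive tests runnable locally is to look for an unpacked copy of the archive and skip those tests when it is absent. A minimal sketch, assuming a hypothetical `NMRGLUE_TEST_DATA` environment variable and a hypothetical `bruker_2d` subdirectory; neither is an existing nmrglue convention:

```python
import os
import unittest
import zipfile
from pathlib import Path

ENV_VAR = "NMRGLUE_TEST_DATA"  # hypothetical variable name, not used by nmrglue


def unpack_archive(archive, dest):
    """Extract a test-data zip archive into dest and return dest as a Path."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return dest


def find_test_data():
    """Return the test-data directory named by ENV_VAR, or None if unavailable."""
    value = os.environ.get(ENV_VAR)
    if not value:
        return None
    path = Path(value)
    return path if path.is_dir() else None


DATA_DIR = find_test_data()


@unittest.skipUnless(DATA_DIR is not None, "extended test data not available")
class ExtendedBrukerTests(unittest.TestCase):
    def test_dataset_present(self):
        # "bruker_2d" is a placeholder directory name for illustration only.
        self.assertTrue((DATA_DIR / "bruker_2d").is_dir())
```

With this layout the CI would run only the small in-repo tests, while anyone with the archive unpacked locally could set the variable and run the full set.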

kaustubhmote commented 4 years ago

@jjhelmus and @JLVarjo I am unable to find the data needed for testing PR #120. The test_data_v0.4-dev.zip file from above is missing the JCAMP-DX datasets. Any idea where I can find these?

JLVarjo commented 4 years ago

Hi, I have it and @jjhelmus should have it, but it seems that this test data archive has not been updated. I think this data is no more than a few megabytes. Should I upload it to the nmrglue_data repo, maybe?

kaustubhmote commented 4 years ago

Maybe you can email it to me for now? We should probably revisit how/where we store these test datasets. It might be best to avoid adding them to the repo for now, as most of the datasets will be larger than the current repo size.

JLVarjo commented 4 years ago

I agree - it would simplify pushing bug fixes if test data could be made instantly available as well. I'll get back to you next Monday and e-mail the data.