TristanHehnen opened this issue 4 years ago
@TristanHehnen Thanks. I have been working on building a module for the gas phase group. It can be found here. I suggest you start a similar module for matl-db called "matl.py" and add your classes to this module. Then, this module can live in a Utilities directory of the repo and be called using something similar to this (this is just a temporary script I'm using to build the module).
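A minimal sketch of what such a `matl.py` module might contain (the class, attribute, and method names here are illustrative assumptions, not the repo's actual API):

```python
# Hypothetical skeleton for a matl.py utilities module; names are
# illustrative only.
import pandas as pd

class TGAExperiment:
    """Holds the metadata and data location for a single TGA test."""
    def __init__(self, label, csv_path, heating_rate=None, initial_mass=None):
        self.label = label                  # test label = data file name
        self.csv_path = csv_path            # path to the CSV in the repo
        self.heating_rate = heating_rate    # K/min, None if not applicable
        self.initial_mass = initial_mass    # mg, None if not provided

    def read_data(self):
        """Return the time-resolved test data as a DataFrame."""
        return pd.read_csv(self.csv_path)
```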
I'll have a look at it.
I would like to introduce to you the initial prototype for parsing the readme files. I would like to get some feedback such that I do not spend a lot of time creating something that is disliked/thrown away in the end.
For now the prototype consists of a script containing the individual functions and a Jupyter notebook that serves as a brief demo of the functionality. It doesn't follow the module proposal by Randy, i.e. "matl.py", yet, but it could certainly be moved in this direction.
As an example, only the TGA data from UMET is processed in the demo. This is primarily due to the overhead that comes with adjusting the readme files. Once we have reached an agreement on how they should look, it would be easier to adjust the format automatically with the script developed here. That means the readme files should be unified, since different laboratories provided different parameters to describe their experimental campaigns (e.g. different temperature programs, lids on the crucibles or ways to describe the crucibles, etc.). In my view, all these items should be used in all experiment descriptions consistently. Items that are not relevant for the particular experiment in question should contain "None", as they do in the dictionary later on. The goal of having the laboratories fill out "None" consciously is to reduce the chance of forgetting data.
Furthermore, I suggest having the heating rates and initial sample masses written only in the "Test Condition Summary" table. They do not seem particularly useful in the text section.
Since Isaac started to unify the data file names, I would like to ask if the label of the individual experiment is supposed to be the same as the data file name in general. This would help reduce the footprint of the summary table, and I don't really see why there should be a difference in naming.
As for the structure of the dictionary, it is ordered by experiment --> institute --> repetition (rep. label/data file name) --> parameters (e.g. heating rate or initial sample mass), see the demo.
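For readers, a minimal sketch of that nesting (institute, test labels, and values are invented for illustration):

```python
# Illustrative structure only; labels and values are made up.
exp_info = {
    "TGA": {                                # experiment type
        "UMET": {                           # institute
            "UMET_TGA_N2_10K_1": {          # repetition label = data file name
                "heating_rate": 10.0,       # K/min
                "initial_mass": 5.2,        # mg
                "crucible_lid": None,       # not used in this test
            },
        },
    },
}

# Access pattern: experiment -> institute -> repetition -> parameter
rate = exp_info["TGA"]["UMET"]["UMET_TGA_N2_10K_1"]["heating_rate"]
```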
As further steps, I would set up functionality that translates the dictionary into the readme file format and saves it as README.md, save the dictionary as a human-readable Python script, set up the functionality to process all the other experiment types, and finally provide the respective README.md templates, such that new data sets can easily be integrated.
@TristanHehnen Thanks for all your work on this! I think it is headed in the right direction. But I am not the keeper for matl-db (only a maintainer), so I think we need to get consensus from Isaac (@leventon) and Morgan (@mcb1).
As long as very simple instructions can be put together for the participants, I am in favor. I suggest you use Isaac or Morgan as a test case and see if they can follow the instructions.
I'll take a look at this more closely tomorrow. To be honest, my python experience is severely limited but I'll see what I can make of things. For now, I'll [comment] on some specific things that you wrote in order:
> As an example, only the TGA data from UMET is processed in the demo. This is primarily due to the overhead that comes with adjusting the readme files. Once we have reached an agreement on how they should look, it would be easier to adjust the format automatically with the script developed here.
[Makes sense to me; As I've started working with data, these issues become more apparent. Even as things are currently... it took a surprising amount of manual effort to rename + reorganize files and edit READMEs into some level of consistency, as we have now.]
> That means the readme files should be unified, since different laboratories provided different parameters to describe their experimental campaigns (e.g. different temperature programs, lids on the crucibles or ways to describe the crucibles, etc.). In my view, all these items should be used in all experiment descriptions consistently. Items that are not relevant for the particular experiment in question should contain "None", as they do in the dictionary later on. The goal of having the laboratories fill out "None" consciously is to reduce the chance of forgetting data.
[Improving on this template would help. Standardizing things also makes sense. Thankfully, we don't expect too much new data to come in. It would be great if README files were submitted consistently. Some labs were wonderful with that. Others not so much. One wrote a great, thorough, multi-page description of their tests: although this was great for understanding, it wasn't when it came time to write the README (and makes automated analysis more difficult)]
> Furthermore, I suggest having the heating rates and initial sample masses written only in the "Test Condition Summary" table. They do not seem particularly useful in the text section.
[This is fine, but is it necessary / does it hurt as is? Effectively, the READMEs come from my edits of test descriptions provided by labs. Often, I would pull data from their written text to populate the summary tables, but I wouldn't go back to delete the written summary included above each section. Keeping that info in the written summary maintains the flow/continuity of some groups' descriptions as you read them]
> Since Isaac started to unify the data file names, I would like to ask if the label of the individual experiment is supposed to be the same as the data file name in general. This would help reduce the footprint of the summary table, and I don't really see why there should be a difference in naming.
[Yes, this should definitely be the case. Now, it's easier to catch which tests look different. When I started, I did my best to guess what info we would need in that file name so names evolved as more data came in. I made some incorrect guesses at the start. That said, I think filenames are all now set; editing READMEs for consistency with those filenames, as you suggest, is the right move and should be doable (without having to repeat the exercise again for future changes)]
Guys, what about the idea of using a Google Form (or something) to submit the README data, which would get converted to a csv and then the scripts would generate the README.md file. In that way, you could have dropdowns where you want only specific answers.
That's probably a good idea. We might get a new data set from Chile in the next few months - that'd be a good trial run for the form, otherwise it'll at least be set for our next material. Either way, @TristanHehnen - after we settle on a final format of the readme, do you want to set up a test case with Google forms to work on a script to build a proper README file from there? Forms can output data as .csv files; parsing them should be straightforward but there might be tweaks needed to do that with your script given the different format of those files vs. our current README.md files.
Ideally, we'd also upload measurement data through google forms, but it looks like Forms requires users to have a google account to do that, so I'll likely just include a text notification reminding visitors to [submit data by email as a .zip file to email@server.com] when they submit the form.
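A rough sketch of the form-to-README conversion Randy suggests, assuming hypothetical form column names (the real form fields would need to match):

```python
# Sketch: turn a Google Forms response CSV into a README.md section.
# The column names ('Test label', etc.) are assumptions, not the actual form.
import csv

def form_csv_to_readme(form_csv, readme_path="README.md"):
    with open(form_csv, newline="") as f:
        responses = list(csv.DictReader(f))
    lines = []
    for row in responses:
        lines.append(f"### {row['Test label']}")
        lines.append(f"* Heating rate: {row['Heating rate [K/min]']} K/min")
        lines.append(f"* Initial sample mass: {row['Initial mass [mg]']} mg")
        lines.append("")  # blank line between tests
    with open(readme_path, "w") as f:
        f.write("\n".join(lines))
```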
As for your script - honestly, I'm out of my element here so we should wait for proper feedback from Morgan. The general concept / flow of what you have here makes sense but I can't really comment on the functionality, writing, or design of the code itself. As for general conceptual comments...
Default options: I am a little worried about defaulting to "None" for set options. In some cases, 'unknown' or 'not provided' would make more sense. (e.g., we should not default to "None" for crucible type/lid type because, in this case, None means that "it wasn't used" [instead of "this info wasn't provided by the user"])
Test Conditions Table: For the test conditions table, especially if we automate it as a form, I am not sure how we can switch that programmatically to allow for different field/header types for different experiments (e.g., TGA has certain settings, and these are unique from Cone or FPA). Linking O2 concentration to the main carrier gas will require some thought, as will handling different inert gases and how we link this prescribed value to the test label/file name.
Test Label and File Name: I believe you noted this already, but Test Label and File Name are redundant, so we should definitely collapse them into one field
Calibration type: This likely could be expanded to include more default field types. Below is an example just for TGA; we'd likely want further thought to provide options for other test types:
- Calibration type: (mass, heat flow, temperature)
- Calibration temperature range:
- Number of calibration materials:
- Frequency:
Initial Mass: For data analysis - it is not uncommon for initial mass to be not-equal-to the first time/line of mass data in .csv files (e.g., due to taring, or buoyancy effects). Currently, we don't have an automated process to converge the two. On a case by case basis, I'd adjust as needed; likely, taring/renormalizing .csv data to the listed (if provided) initial mass.
I really like that we have some ability to visualize data so the Plotting section is great, but I'll hold off on comments there until we can sort through the README first
I would not worry about the "None". If you use a form, then whatever you have in the dropdown can be converted to None as needed. None is commonly used in Python script arguments, so it is handy that way.
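A sketch of the conversion Randy describes, with placeholder dropdown strings (the actual form wording would differ):

```python
# Map 'empty' dropdown answers onto Python's None; anything else passes
# through unchanged. The sentinel strings are assumptions.
MISSING = {"", "None", "Not used", "Not provided", "Unknown"}

def normalize(value):
    value = value.strip()
    return None if value in MISSING else value
```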
@TristanHehnen I'd say press forward with your processing scripts. We are in a similar situation on the gas phase where really only I know how the scripts work. To some degree, this is unavoidable. The fact that you are taking charge and making things happen means you are in control of this aspect of the project. It is welcome from my point of view.
@leventon I would argue that "ideally" measurement data would come from a pull request to GitHub. In lieu of that, emailing a zip that we push to GitHub is the best option. Usually I have to massage the column headers, etc.
But let's give the form idea a try just amongst ourselves. Create a simple toy form and send it to me and Tristan and we can build from there.
Sounds good. Will do.
Thank you @leventon and @rmcdermo for your time and comments!
@leventon :
For the next steps, I would like to wrap up the TGA experiment processing functionality, using the UMET data set as an example. When we have agreed on how it should look, I would propagate the necessary changes to all other TGA experiments within the repo. Afterwards, I would add the functionality for the next experiment type with a single example case, e.g. cone calorimeter, have this discussed, propagate the changes, and so forth.
I like the Wiki; it's a good addition to start building/adding reference material there. As for the hope that we'll get people to submit data through GitHub PRs vs. email – long term, I hope we get there, but we just haven't seen any willingness from our participants yet. It's a learning curve / barrier to entry that we likely won't get past with all participants. I am no longer wholly incompetent with GitHub, but it took a surprising amount of effort to get to this level (just for adding/editing data/files in our repo). I'm not sure I would want to go through that just to submit files (and it looks like most contributors wouldn't either). Though it may exist elsewhere, writing up the most basic, step-by-step walkthrough of how to do that in GitHub (including how to set up an account, clone the repo, ...) and posting it to the wiki would help, especially for new users (even if we just copy it from elsewhere)
As for the TGA/DSC template that's there, I'm still not sure how well it will work out if we rely on that vs. trying to create a form that needs to be filled out in a certain way. I mention that because we had already included that info in the guidelines that were emailed to everyone AND templates were available on the repo when most labs submitted data, but we still got quite a spread in what was submitted. So long as they can edit fields (e.g., [none]) when they write their own files, we'll likely get a lot of variation (not all labs, but most). You mentioned different templates – as I started thinking about a Google Form for that, I came to the same conclusion: we likely need multiple options to record info from all the test types people are submitting. If you're okay with unique Python scripts for each one, we should be okay.
As a trial run for that vs. just following the templates you provide for TGA data – using the Chile group as a test case might be a good idea. I suspect they'll submit cone and TGA data. How would you feel about getting your template up and available on the repo and making a Google Form as a second option? Hopefully they can provide feedback on which was easier to work with. Til then, fields like calibration types that need (or would benefit from) suggested items can likely best be defined with a drop-down menu in a form.
TGA naming: I'm right there with you on those file names getting long / out of hand. In fact, heating rate wasn't even included in some of the earlier data sets. As I started analyzing that data, though, it became apparent that heating rate and gaseous environment were needed (or at least very helpful) to include.
TGA initial masses: What's happening here is likely how the experiment is run. From my experience with the test, you have a range of options for defining that mass (m0). In a number of cases, separately measuring m0 before you start your test gives you the most accurate measurement. The balance is hypersensitive, so you can see shifts in that signal at the start/end of the test. Let's say the true mass is 5.0 mg. It's not uncommon for the initial steady-state TGA mass (at 20 C-80 C) to read higher or lower (though be stable). In that case, I'd use the time-resolved mass loss but renormalize the initial mass to match m0 as measured independently. Steps like that are clear to the experimentalist (it's why UMD submitted their own averaging / uncertainty analysis), but they can be hard to automate. *This is something that will require further discussion
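A minimal sketch of the renormalization step Isaac describes, assuming a "Mass [mg]" column and that the first few readings form the stable initial plateau (both assumptions, not the repo's actual format):

```python
# Rescale the time-resolved TGA mass signal so its apparent initial value
# matches the independently measured initial mass m0.
import pandas as pd

def renormalize_mass(csv_path, m0):
    df = pd.read_csv(csv_path)
    # Mean of the first 10 readings taken as the apparent initial mass.
    apparent_m0 = df["Mass [mg]"].iloc[:10].mean()
    df["Mass [mg]"] *= m0 / apparent_m0
    return df
```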
I think I shared with you (email) a copy of the outline/next steps that was sent to the condensed phase committee a couple weeks ago. In ~2 weeks from now, I’ll need to prepare a summary of data to the participants. That will include preliminary analysis of all test data (hopefully, I’m working through that in MATLAB now). When that report is shared with the committee, and then with participants, we are requesting feedback on how we want to do that analysis (e.g., how to define smoothing, test averaging, uncertainty analysis, key data point identification…) There is not necessarily one best approach; one of our goals was to come to a consensus as a community on how to do that. Because those requirements will evolve, it may be worth waiting a week or two on your end to write/finalize those analysis scripts in python until we agree on how we want to do that analysis (how we format the data may even need to change; e.g., time/temperature resolution). For now, plotting tools like you have for the TGA data – those are great for visualization and likely won’t have to change as much, so they may be better to focus on in the near term. When I share the report/summary approach with the committee (a week before it’s widely shared with the community/participants) I would like your feedback though, if you can. The exact code between matlab/python will change, but the functionality in the end will be the same.
Of all files to work with for TGA – please avoid UMET for now. That set is messed up. I’m aware of some challenges; I have different notes on what I want to do there, and I’ll edit it eventually when I can but.. just for now, please choose a friendlier set. I think SANDIA TGA data was okay (and that gives you a range of test conditions to play with too).
Oops. Please forgive the formatting of that last message. Larger font is not meant to indicate emphasis, I don't know what happened there.
Markdown thinks the ----- means you are formatting a table, and it makes the column headings of tables bold. (Part of the learning curve :)
As usual, I disagree with the comment about automation. These things are not difficult. Just get the data into a simple column format and we can do pretty much anything.
Hello everyone, my apologies for the radio silence recently.
I've now implemented some improvements for processing the README files. The individual steps are now better encapsulated in functions. These functions contain inline comments and docstrings, in an effort to make things more accessible for users, or rather developers.
For developers and maintainers:
There is a Jupyter notebook, ExpInfoConstruction, that details how the README files are processed. This is meant to explain the process to developers who would like to contribute to the utilities. Furthermore, it is used to create the dictionary and save it to a Python file, such that maintainers of the repository can easily update the dictionary when new data comes in.
The creation of the Python file is not meant to be performed often, only for updates by maintainers, or possibly contributors, when new data is added.
For users: For regular use, the Python file is imported and all the information is then readily accessible. Users should ideally not need to deal with the steps mentioned above.
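The intended user workflow might then look like this (a sketch; the variable name exp_info and the parameter keys are assumptions beyond the experiment -> institute -> repetition nesting described above):

```python
# Import the generated dictionary and read off parameters directly.
from ExperimentalDataInfo import exp_info  # assumed variable name

for institute, tests in exp_info["TGA"].items():
    for label, params in tests.items():
        print(institute, label, params["heating_rate"])
```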
Now the question is whether the layout/format of the README files, at least for TGA experiments, is settled (e.g. my proposal in the UMET README in my fork). Then I would make another pass over the implemented functions to ensure they work with said format, and unify the remaining README files. Afterwards I would start processing the cone calorimeter data, for example. I will also update the demonstration of the usage of the dictionary that is meant for the users.
What are your thoughts about this?
@TristanHehnen I am very much in favor of moving forward with your Python scripts. I have just spent the last few days going through the current Matlab scripts and, while these were necessary to get started, they need an overhaul.
What would be very helpful, and I am not sure how far you are from having this, is if you could create a master Python script that would process the exp data and create all the plots needed for Isaac's document.
Isaac is going to email me his personal copy and then I will push the pdf up to the Releases page. You can then use that document as a basis for your scripts. If that document is not sufficient, then I think it means it needs work. So, this will be an excellent exercise.
@rmcdermo
The overall goal with the above discussed dictionary is indeed to facilitate the automatic processing of all the information within the matl_db repo. However, said scripts are merely the foundation, aiming to structure all the information and make it more easily accessible (at least for people using Python).
For now, the "master Python script" is not really feasible, because the remaining README files need to be adjusted, and the only experiments accessible so far are the TGA experiments. These are the next steps I'm working on, as mentioned above.
I can certainly help translate the Matlab functionality into Python scripts. However, if it is not too urgent, I would like to focus first on the foundation: processing all the README files.
For translating the Matlab functionalities, I would open a new issue to keep both tasks clearly separate.
The translation of the Matlab functionalities to Python now has its own issue; see issue #80.
@leventon I will now adjust the TGA/DSC parts of the README files, primarily the test summary tables, such that these tables contain the O2 concentration. I will also remove the file name column from my UMET example, because I believe the consensus was that the file names should be identical to the test names. For cases where TGA and DSC were conducted simultaneously, I will set up a function that adjusts the [...]STA[...] in the test name to the respective test when reading the file names. Even though both tests could be performed simultaneously, I would keep them separate within the dictionary, primarily to deal with cases where only one of each was conducted. Also, this information will not be lost anyway.
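The label adjustment for simultaneous STA runs could be as simple as this sketch (the naming pattern is an assumption):

```python
def sta_to_technique(file_label, technique):
    """Derive a per-technique test label from an STA file name, e.g.
    'Inst_STA_N2_10K_1' -> 'Inst_TGA_N2_10K_1' for technique='TGA'."""
    return file_label.replace("STA", technique)
```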
So, just as a heads-up: the TGA data can now be processed, and the respective information is already stored in the Python file containing the dictionary. There is a brief demonstration notebook that plots the TGA results from the institutes that submitted test data with a 20 K/min heating rate, just as an example.
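The filtering step in such a demo might look roughly like this, building on the dictionary sketch above (keys and CSV column names are assumptions):

```python
# Plot mass curves for all TGA tests run at a 20 K/min heating rate.
import matplotlib.pyplot as plt
import pandas as pd

for institute, tests in exp_info["TGA"].items():
    for label, params in tests.items():
        if params["heating_rate"] == 20.0:
            df = pd.read_csv(params["csv_path"])
            plt.plot(df["Temperature [C]"], df["Mass [mg]"], label=label)

plt.xlabel("Temperature [C]")
plt.ylabel("Mass [mg]")
plt.legend()
plt.show()
```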
I would now proceed to the cone calorimeter.
EDIT: Typo
Looks great, thanks!
Update: DSC data can now be processed. Construction notebook and dictionary are updated accordingly.
EDIT: DSC README template added as well.
Hi @leventon and @rmcdermo,
I've now unified most of the README files concerning the cone calorimeter data. Based on this I've created a template.
I would like to ask you to check said template for consistency and completeness. Specifically, look at the sample holder and retainer frame dimensions. Across various README files, values for both were provided and are thus replicated in the template. The main questions here are:
Furthermore, there are some significant differences in the volume around the sample and heater (sample chamber). Some apparatuses have some kind of box around them (glass walls at the sides), while others can seal this part off and control the atmosphere. I'm not sure how to deal with this, and I've just provided a relatively basic approach to collect this data. Would this be sufficient, or am I missing something here? Would we need some flow rates here as well, specifically for the controlled-atmosphere ones?
For the backing, my idea is to address each material as an individual layer. The provided lines would need to be copied and the individual entries numbered accordingly.
With the thermocouples, there are different ways their locations are reported. Some are marked "front" and some "back". I'm now thinking that there could be two coordinate systems, one starting at the centre of the front face of the sample and the other on the top face of the backing (back of the sample). Then negative z-values would denote locations within the sample and backing, respectively; positive z-values point towards the heater for both systems. Furthermore, it might be interesting to address the directions from which the individual thermocouples are led to the measurement location. There could be some point like "Lead from: left side", or something.
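One way the proposed convention could be encoded in the dictionary (field names invented for illustration):

```python
thermocouple = {
    "label": "TC_1",
    "reference": "sample_front",          # or "backing_top"
    "position_xyz_mm": (0.0, 0.0, -2.0),  # negative z: inside sample/backing
    "lead_from": "left side",             # direction the wire is led from
}
```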
The summary table could be extended with a column for the flame-out time and a column for the residual mass after the test.
Tristan
Why not also require a detailed drawing of the system? Modelers usually need this sort of thing.
Hi Tristan,
There's a lot here; it looks comprehensive. Good work.
I added a new README that lists some of the most important features of tests/setup that we'd want to know (prepared w/ committee input, following a call 1-2 weeks ago). I think your template is pretty exhaustive but please review that to see if anything is missing (e.g., baseline corrections, calibration information): https://github.com/MaCFP/matl-db/tree/master/Non-charring/PMMA
In general, I'd try to avoid repeat data (e.g., heat fluxes or sample mass at the top of the list are also in the table at the end). The table should be the one source for data that varies between tests (specific values that you might read in programmatically); above should be general info for someone browsing the data. Could we add a descriptor in the above section noting that these are more general values (e.g., Heat flux(es): 25-65 kW/m2 || Initial sample mass: ~70 g)?
Frame/holder info: The key is the exposed sample surface area. Similarly, other sample area/mass values should be noted as 'initial'.
Backing materials: Some of these values will be temperature-dependent. It's not clear to me what the best way to report that is. At the very least, we'll need a field for the temperature at which a property is evaluated, if not the ability to list multiple values. For completeness/good modeling, this info is key. We're asking for a lot here, so either a link to a source or a separate formatted text file with this info might be needed.
Thermocouples: Bead diameter shouldn't really affect measurements; we're not in the gas phase. Do we need this (given how much other info is here)?
Instrument
Hi @leventon !
First off: yes, there are a lot of items in the template. I would like to emphasise that they are all collected from the README files that have been provided by the institutes. I only added two minor details: the ignition time column and the bead diameter.
From the points in the README you linked to, I seem to have most of it covered. The main thing missing in the template is the baseline corrections. I've talked to some of our experimenters, and they mentioned "correction curves", specifically for TGA/DSC types of apparatuses. Would this be similar? If we would like to have this data, I suggest providing it as *.csv files. The file could be labeled InstituteLabel_BaselineCorrections.csv. The column labels could be the individual test labels the baseline corrections correspond to.
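A possible layout for such an InstituteLabel_BaselineCorrections.csv (test labels and values invented for illustration):

```
Time [s], Inst_TGA_N2_10K_1, Inst_TGA_N2_10K_2
0, 0.000, 0.001
10, 0.002, 0.002
```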
Sure, we can use the bullet points up top only for the general info of heat flux and initial sample mass.
Okay, I propose then that I reduce the bullet points for the whole sample-holder-thing to this:
I put the rest into the description part in the beginning, to keep the information on the sample holder that was already provided by the institutes.
The data of the backing materials could be stored in another *.csv file (my favourite) or in another table in the README. The file name should then be something like InstituteLabel_GlassWool.csv or InstituteLabel_Backing1.csv.
I'll remove the bead diameter.
I'll change the calibration to:
I'll adjust the apparatus type.
Regarding point 3: I'll move the nominal exposed surface and the diameter/edge length to the sample itself, which reduces the number of items for the retainer frame even more.
Hi @TristanHehnen , lots of good work here, thanks for the update.
We should confirm the group's calibration procedure (type / frequency / matls) and whether or not results have been corrected for drifts in their baseline (TGA, DSC, MCC, and Cone HRR are all often adjusted in such a way). This calibration and baseline correction should be done by the experimentalist, not the modeler, and the process described (not asked to be reproduced).
Are these the same as a "correction curve"? Maybe, but that's ambiguous wording to me. For clarity, I'd refer you to each of the reference texts suggested in the preliminary summary (they discuss the principles/practices needed), but that's not the most helpful of replies. A more friendly, immediately useful approach might be to have a ~30 min call where we can go through each of these types of corrections/calibrations and setup/processing steps, rather than trading messages.
Works for me. For shape, simply offering [None/square/round] may provide consistency, though I suspect we're doing all that ourselves/manually anyway.
This is a more interesting question. A standard format for what properties we want / how they should be submitted would be needed. Such info is not necessarily given by our labs. Some discussion with them may help. Going through the effort of creating this nice standard template and requesting all the files should be balanced against what info they can and will provide.
Small clarification: the thermocouple diameter was introduced by Edinburgh, and I changed it to bead diameter.
Got it, thanks.
Can we keep that one as a "note" but not request it of all labs?
Sure.
As mentioned in #26, here is the link to the first iteration of ExperimentalDataInfo.py. It contains basically the information that is provided via the README.md files, and it knows the location of the CSV files containing the data from the different experiments. I like this approach because it allows me to access all the information from within Python scripts or Jupyter notebooks. I find that the human-readable keys for accessing the different items reduce errors. Also, dictionaries can easily be transformed into Pandas DataFrames, which allows for nice rendering of tables in Jupyter notebooks. Furthermore, I can easily pass the information on to scripts that build FDS and optimisation input files. It could be located in the root directory of the MaCFP Git repo (obviously, file paths would need to be adjusted).

Since it aims to mirror the structure of the README.md files, it might be relatively simple to set up scripts that automatically screen the repository and add information on new data sets. If this script is considered a useful addition to the MaCFP project, we can add it in.
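The dictionary-to-DataFrame step mentioned above might look like this sketch (assuming the exp_info name and the experiment -> institute -> repetition -> parameter nesting from the demo):

```python
# Flatten the nested dictionary into one row per test repetition.
import pandas as pd
from ExperimentalDataInfo import exp_info  # assumed variable name

rows = []
for institute, tests in exp_info["TGA"].items():
    for label, params in tests.items():
        rows.append({"institute": institute, "test": label, **params})

summary = pd.DataFrame(rows)
summary  # renders as a formatted table in a Jupyter notebook
```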