Closed f-hafner closed 4 months ago
@Lsage , can you tell me which scripts, and any other data, you would like to be exported?
Our goal is to track all code in git repositories outside the RA. This should speed up development and ease collaboration. To do this, we want to export all existing code that is not yet in this repository. We also want to export some summary statistics of the input data to our models so that we have something to work with.
So, please tell me by Friday which scripts I should export, and give me code to create the summary statistics on the input data of your models (or create the summary statistics yourself and tell me where to find them for the export).
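For reference, a minimal sketch of what such summary-statistics code could look like. It assumes the input is a CSV file with a header row; the function name `summarize_numeric_columns` and the example columns are hypothetical and not taken from the project. It reports the total number of rows plus count, mean, and standard deviation for each column that contains numeric values.

```python
import csv
import io
import statistics


def summarize_numeric_columns(handle):
    """Return (total row count, per-column summary) for a CSV file handle.

    Hypothetical helper: only cells that parse as numbers are summarized.
    """
    reader = csv.DictReader(handle)
    values = {}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for column, raw in row.items():
            try:
                number = float(raw)
            except (TypeError, ValueError):
                # Skip missing or non-numeric cells.
                continue
            values.setdefault(column, []).append(number)

    summary = {}
    for column, numbers in values.items():
        summary[column] = {
            "n": len(numbers),
            "mean": statistics.fmean(numbers),
            # The sample standard deviation needs at least two values.
            "sd": statistics.stdev(numbers) if len(numbers) > 1 else None,
        }
    return n_rows, summary


# Usage with made-up data (not real CBS records).
example = io.StringIO("id,income,city\n1,10,A\n2,20,B\n3,30,C\n")
total_rows, stats = summarize_numeric_columns(example)
print(total_rows, stats["income"])
```

Whatever the real code ends up looking like, its output still has to fit the export constraints that apply in the RA.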
Ok. It depends on whether we are interested in the scripts that reformat the original CBS data or only in the modeling part.
@Lsage I think it would be good to export both.
@f-hafner I edited the description of the issue
I agree with Tanzir that ideally we export both. There's actually some work by another group on re-creating fake CBS records. It's a private repo to which I have access; @Lsage if you have time today after the meeting we can look at this and see if we can re-use/expand that
@f-hafner yes ok let's have a look after the meeting
@Lsage , I created a private fork of the repo we discussed today: https://github.com/odissei-lifecourse/cbs_validationdata
Have a look; if it's too much work to create fake data also for all original files, we can focus on the intermediate ones.
@f-hafner This looks very interesting. Can we generate synthetic versions of all CBS files on our local computers using these scripts, without access to the actual CBS data? If not, can you tell me what these scripts can do?
I added some docs for exports: https://github.com/odissei-lifecourse/life-sequencing-dutch/blob/main/docs/ra_export.md
@tanzir5 , the format of your exports needs to be adjusted to these constraints.
@Lsage , we should sit together when making the export to fill the Standardformulier Export correctly.
Done @f-hafner
Update: current export was not approved
To be fixed:
work_env
? -> sample for all files, but count total number of records in the file
We don't have code for running the network stats on work env, and it's not a priority at the moment. Closing.
To include