1. Update on last meeting’s action items
Automated Extraction of Data for Australia - @Ecaloota has written up a post about the automated extraction of data from the ARTG (Australian Register of Therapeutic Goods) #11. Approval dates were in the PDFs, but they are not consistent over time, so they were collected manually. Information that couldn’t be extracted included patent and exclusivity agreements, the current market status, and whether the listed products are generics.
@kym834 - For patents, if they are worldwide patents, we might be able to find out whether the products are still under patent by checking against our US data.
@kym834 - Add whether or not it has been approved as a simple yes/no, to make clear when information is unavailable because the product is not approved rather than simply missing.
Other sources to mine from
@kym834 - Found no other sources. If anyone finds some please let us know.
Stage II key events and comparisons between US and Aus
No progress on these action items yet. Open for volunteers to join and help with these tasks. Can we automate this in some way?
@yaelago to post on GitHub and @Ecaloota will look into this more deeply.
Publishing opportunities
Keep eyes open. Put up a blog post on the Breaking Good website about National Science Week after it is all done and dusted.
2. National Science Week
We received a grant from Inspiring Australia to help fund an event during National Science Week. It will include a 1-hour event on Aug 18 with mini talks and a panel discussion, followed by a mini hackathon. This will be hosted on Zoom. On Aug 20 there will be a 12-hour hackathon. We hope to run an activity that combines Stages I and II, as we did for the USA, but for Australia, where the data collection cannot be automated. Again, this will be hosted on Zoom, with team members taking shifts to greet and help community members. We have also planned to run a few school-focused sessions if there is some interest.
3. Next Steps
Where automation is possible, who can help, what additional resources or assistance is needed? - Automation is dependent on the data source. Where we have a database but automation isn’t possible, the best place to start is with the people who manage it. Who can help will also depend on what needs to be done.
Important to note that we are only interested in publicly available information and that we are trying to collect data from websites where the data is reliable.
Where automation is not possible, what is the best method for data collection? - Reflecting on what we’ve done so far, we’ve had success with very scaffolded work but not so much when more freedom is given. In future we should consider giving a set period of time when asking people to contribute. We should also keep a spontaneous and opportunistic way for people to share information instantly with no boundaries, as you never know what will pop up.
@alintheopen - What would motivate people to take the time out of their normal routine to do this?
@Ecaloota - Likes that ACSA and DigiVol give you a profile where you can see what you’ve contributed to and count your participation, or send emails that explain how your input has helped to progress the project.
@alintheopen - We can pull activity from GitHub and have it displayed as a ‘leaderboard’ on a website. This could be an option for us.
4. AOB
We are putting in an expression of interest (EOI) for the Australian Research Data Commons (ARDC) National Data Partnerships program, due July 31. EOIs are shared publicly on their website.
5. Action Items
[ ] ACTION ITEM: @yaelago to post on GitHub and @Ecaloota will look into this more deeply.
[ ] ACTION ITEM: All - think more creatively about how we can implement some of @Ecaloota’s comments about acknowledging participation.
[ ] ACTION ITEM: All - think about how we can make the next stages more engaging and interesting for people who are giving their time towards the project.
6. Next meeting
After National Science Week, date and time TBD.
This meeting follows on from the previous meeting (minutes 👉 #10)
When: Tuesday 27th July 2020 at 9 pm AEST (Sydney, Australia)
Where: Held through Zoom
Who: @fantasy121, @kym834, @alintheopen, @yaelago, @Ecaloota, Peter Rutledge
You can view the recording of the meeting here