Closed adam-does-code closed 3 years ago
Hi,
It is possible to combine multiple matches using the joinmatches function. For instance, this template:
<vars>
descr_chain = [
"PHRASE",
"exclude('Prerequisite(s)')",
"exclude('Department(s)')",
"joinmatches"
]
</vars>
<group>
{{ course }}*{{ code }} {{ name | PHRASE }} {{ semester }} ({{ lecture_lab_time }}) [{{ weight }}]
{{ description | chain(descr_chain) }}
Prerequisite(s): {{ prereqs | ORPHRASE }}
Department(s): {{ department | ORPHRASE }}
</group>
For this data:
ECON*3400 The Economics of Personnel Management U (3-0) [0.50]
In this course, we examine the economics of personnel management in organizations.
Using mainstream microeconomic and behavioural economic theory, we will consider
such issues as recruitment, promotion, financial and non-financial incentives,
compensation, job performance, performance evaluation, and investment in personnel.
The interplay between theoretical models and empirical evidence will be emphasized in
considering different approaches to the management of personnel.
Prerequisite(s): ECON*2310 or ECON*2200
Department(s): Department of Economics and Finance
ECON*4400 The Economics of Personnel Management U (7-1) [0.90]
In this course, we examine the economics of personnel management in organizations.
Using mainstream microeconomic and behavioural economic theory, we will consider
such issues as recruitment, promotion, financial and non-financial incentives,
compensation, job performance, performance evaluation, and investment in personnel.
Prerequisite(s): ECON*2310
Department(s): Department of Economics
would produce:
[[[{'code': '3400',
'course': 'ECON',
'department': 'Department of Economics and Finance',
'description': 'In this course, we examine the economics of personnel management in organizations.\n'
'Using mainstream microeconomic and behavioural economic theory, we will consider\n'
'such issues as recruitment, promotion, financial and non-financial incentives,\n'
'compensation, job performance, performance evaluation, and investment in personnel.\n'
'The interplay between theoretical models and empirical evidence will be emphasized in\n'
'considering different approaches to the management of personnel.',
'lecture_lab_time': '3-0',
'name': 'The Economics of Personnel Management',
'prereqs': 'ECON*2310 or ECON*2200',
'semester': 'U',
'weight': '0.50'},
{'code': '4400',
'course': 'ECON',
'department': 'Department of Economics',
'description': 'In this course, we examine the economics of personnel management in organizations.\n'
'Using mainstream microeconomic and behavioural economic theory, we will consider\n'
'such issues as recruitment, promotion, financial and non-financial incentives,\n'
'compensation, job performance, performance evaluation, and investment in personnel.',
'lecture_lab_time': '7-1',
'name': 'The Economics of Personnel Management',
'prereqs': 'ECON*2310',
'semester': 'U',
'weight': '0.90'}]]]
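As a rough illustration of what the chain in the template does to the description lines, here is a plain-Python sketch. It is not TTP's implementation: the helper name join_description and its parameters are made up for this example, and it assumes exclude behaves like a substring check and joinmatches like a string join with "\n" as the default separator.

```python
def join_description(lines, excluded=("Prerequisite(s)", "Department(s)"), sep="\n"):
    """Drop lines containing any excluded marker, then join the rest.

    Hypothetical helper: approximates exclude(...) followed by joinmatches.
    """
    kept = [line for line in lines if not any(marker in line for marker in excluded)]
    return sep.join(kept)


# A shortened course body: two description lines plus the two labelled lines.
body = [
    "In this course, we examine the economics of personnel management in organizations.",
    "Using mainstream microeconomic and behavioural economic theory, we will consider",
    "Prerequisite(s): ECON*2310",
    "Department(s): Department of Economics",
]

description = join_description(body)        # joined with the default "\n"
one_line = join_description(body, sep=" ")  # joined with a different separator
```

Only the two free-text lines survive the filter; the labelled lines are left for their own template lines to capture.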
How it works:
- {{ description | chain(descr_chain) }}: descr_chain references a variable that holds the list of functions to pass matches through. As you might have more patterns to exclude for the description variable, it might be easier to define them through chain.
- exclude('Prerequisite(s)'): you need it because the PHRASE pattern for the description variable and the {{ prereqs | ORPHRASE }} variable match the same lines, and TTP will select the first match, which corresponds to the description variable. As a result, you need to explicitly filter matches for the description variable using the exclude function.
- joinmatches: by default uses "\n" to join matched results. It can use other symbols, or you can use to_list to transform matches into a list and have joinmatches combine them in a list.
Thanks so much for the reply!
Could I do the same thing if I had text that looked like this? I've been trying but haven't had much success:
IBIO*4521 Thesis in Integrative Biology F (0-12) [1.00]
This course is the first part of the two-semester course IBIO*4521/2. This course is
a two-semester (F,W) undergraduate project in which students conduct a comprehensive,
independent research project in organismal biology under the supervision of a faculty
member in the Department of Integrative Biology. Projects involve a thorough literature
review, a research proposal, original research communicated in oral and poster
presentations, and in a written, publication quality document. This two-semester course
offers students the opportunity to pursue research questions and experimental designs
that cannot be completed in the single semester research courses. Students must make
arrangements with both a faculty supervisor and the course coordinator at least one
semester in advance. A departmental registration form must be obtained from the course
coordinator and submitted no later than the second class day of the fall semester. This is
a twosemester course offered over consecutive semesters F-W. When you select this
course, you must select IBIO*4521 in the Fall semester and IBIO*4522 in the Winter
semester.A grade will not be assigned to IBIO*4521 until IBIO*4522 has been completed.
Prerequisite(s): 12.00 credits
Restriction(s): Normally a minimum cumulative average of 70%. Permission of course
coordinator.
Department(s): Department of Integrative Biology
For the restrictions, the text runs onto multiple lines.
Well, surprisingly, after experimenting a bit, I came up with this:
from ttp import ttp
import pprint
data = """
IBIO*4521 Thesis in Integrative Biology F (0-12) [1.00]
This course is the first part of the two-semester course IBIO*4521/2. This course is
a two-semester (F,W) undergraduate project in which students conduct a comprehensive,
independent research project in organismal biology under the supervision of a faculty
member in the Department of Integrative Biology. Projects involve a thorough literature
review, a research proposal, original research communicated in oral and poster
presentations, and in a written, publication quality document. This two-semester course
offers students the opportunity to pursue research questions and experimental designs
that cannot be completed in the single semester research courses. Students must make
arrangements with both a faculty supervisor and the course coordinator at least one
semester in advance. A departmental registration form must be obtained from the course
coordinator and submitted no later than the second class day of the fall semester. This is
a twosemester course offered over consecutive semesters F-W. When you select this
course, you must select IBIO*4521 in the Fall semester and IBIO*4522 in the Winter
semester.A grade will not be assigned to IBIO*4521 until IBIO*4522 has been completed.
Prerequisite(s): 12.00 credits
Restriction(s): Normally a minimum cumulative average of 70%. Permission of course
coordinator.
Department(s): Department of Integrative Biology
IBIO*4533 Thesis in Integrative Biology F (0-14) [2.00]
This course is the first part of the two-semester course IBIO*4521/2. This course is
a two-semester (F,W) undergraduate project in which students conduct a comprehensive,
independent research project in organismal biology under the supervision of a faculty
member in the Department of Integrative Biology.
Restriction(s): Normally a minimum cumulative average of 80%. Permission of course
coordinator. Normally a minimum cumulative average of 90%. Permission of course
coordinator.
Department(s): Department of Integrative Biology
"""
template = """
<vars>
chain_1 = [
"ORPHRASE",
"exclude('Prerequisite(s)')",
"exclude('Department(s)')",
"exclude('Restriction(s)')",
"joinmatches"
]
</vars>
<group>
{{ course }}*{{ code }} {{ name | PHRASE }} {{ semester }} ({{ lecture_lab_time }}) [{{ weight }}]
{{ description | chain(chain_1) }}
Prerequisite(s): {{ prereqs | ORPHRASE }}
Department(s): {{ department | ORPHRASE }}
<group name="_">
Restriction(s): {{ restrictions | PHRASE | joinmatches }}
{{ restrictions | chain(chain_1) }}
</group>
</group>
"""
parser = ttp(data=data, template=template, log_level="ERROR")
parser.parse()
res = parser.result()
pprint.pprint(res, width=150)
# prints:
#
# [[[{'code': '4521',
# 'course': 'IBIO',
# 'department': 'Department of Integrative Biology',
# 'description': 'This course is the first part of the two-semester course IBIO*4521/2. This course is\n'
# 'a two-semester (F,W) undergraduate project in which students conduct a comprehensive,\n'
# 'independent research project in organismal biology under the supervision of a faculty\n'
# 'member in the Department of Integrative Biology. Projects involve a thorough literature\n'
# 'review, a research proposal, original research communicated in oral and poster\n'
# 'presentations, and in a written, publication quality document. This two-semester course\n'
# 'offers students the opportunity to pursue research questions and experimental designs\n'
# 'that cannot be completed in the single semester research courses. Students must make\n'
# 'arrangements with both a faculty supervisor and the course coordinator at least one\n'
# 'semester in advance. A departmental registration form must be obtained from the course\n'
# 'coordinator and submitted no later than the second class day of the fall semester. This is\n'
# 'a twosemester course offered over consecutive semesters F-W. When you select this\n'
# 'course, you must select IBIO*4521 in the Fall semester and IBIO*4522 in the Winter\n'
# 'semester.A grade will not be assigned to IBIO*4521 until IBIO*4522 has been completed.',
# 'lecture_lab_time': '0-12',
# 'name': 'Thesis in Integrative Biology',
# 'prereqs': '12.00 credits',
# 'restrictions': 'Normally a minimum cumulative average of 70%. Permission of course\ncoordinator.',
# 'semester': 'F',
# 'weight': '1.00'},
# {'code': '4533',
# 'course': 'IBIO',
# 'department': 'Department of Integrative Biology',
# 'description': 'This course is the first part of the two-semester course IBIO*4521/2. This course is\n'
# 'a two-semester (F,W) undergraduate project in which students conduct a comprehensive,\n'
# 'independent research project in organismal biology under the supervision of a faculty\n'
# 'member in the Department of Integrative Biology.',
# 'lecture_lab_time': '0-14',
# 'name': 'Thesis in Integrative Biology',
# 'restrictions': 'Normally a minimum cumulative average of 80%. Permission of course\n'
# 'coordinator. Normally a minimum cumulative average of 90%. Permission of course\n'
# 'coordinator.',
# 'semester': 'F',
# 'weight': '2.00'}]]]
Looks more or less like what you need, but test/verify it on your dataset before using it at scale.
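One way to do that check is to walk the nested result and report records that lack fields you expect. The sketch below is only a suggestion: iter_records, find_missing and the list of required field names are assumptions made for this example, and the nesting follows the [[[...]]] shape printed above.

```python
def iter_records(node):
    """Yield every dict found in an arbitrarily nested list structure."""
    if isinstance(node, dict):
        yield node
    elif isinstance(node, list):
        for item in node:
            yield from iter_records(item)


def find_missing(results, required):
    """Return (course code, missing field names) for each incomplete record."""
    report = []
    for record in iter_records(results):
        missing = sorted(set(required) - set(record))
        if missing:
            report.append((record.get("code"), missing))
    return report


# Trimmed-down stand-in for parser.result(); values shortened for brevity.
sample = [[[
    {"code": "4521", "course": "IBIO", "prereqs": "12.00 credits", "description": "..."},
    {"code": "4533", "course": "IBIO", "description": "..."},
]]]

report = find_missing(sample, required=["course", "code", "description", "prereqs"])
print(report)  # [('4533', ['prereqs'])]
```

Here the second course is flagged because its source text has no Prerequisite(s) line, which may be perfectly valid. The point is to surface such records for a manual look.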
How it works:
- chain_1: updated chain definition to exclude one more pattern
- name="_": uses the null path feature to flatten results a bit
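To picture what the null path buys you, the sketch below mimics the effect in plain Python: a child group's fields end up in the parent record instead of under their own key. This is an illustration, not TTP code. The key name restr and the function merge_child are hypothetical, and the "before" shape assumes the child group had been given an ordinary name.

```python
def merge_child(record, child_key):
    """Return a copy of record with the dict under child_key merged into it."""
    flat = {k: v for k, v in record.items() if k != child_key}
    flat.update(record.get(child_key, {}))
    return flat


# Assumed shape if the inner group were named "restr" instead of "_".
nested = {
    "code": "4521",
    "course": "IBIO",
    "restr": {"restrictions": "Normally a minimum cumulative average of 70%."},
}

flat = merge_child(nested, "restr")
```

After the merge, restrictions sits next to code and course, which matches the flat records shown in the output above.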
Closing, let me know if any further help is needed.
Hi!
I'm trying to parse a multi-line paragraph and I haven't been able to figure it out. I was wondering if you could help confirm whether ttp can handle a template such as this:
An example text:
Currently my template looks like: